At Freakalytics, we've used the D3 data visualization library on several client projects and have been impressed with the nearly infinite set of graphing, charting and mapping possibilities. Unfortunately, we were less impressed with the steep learning curve, level of effort and complexity involved in developing and customizing the desired visualizations.
Perhaps you have seen D3 in the New York Times? D3 examples like those in the New York Times are typically made by teams with expertise in D3 and related web technologies. Now, forward-leaning visual analytics companies like Qlik are opening their API to work in harmony with the wide range of D3 visualizations.
Now, the really good news! An open source effort at the Data Lab of the University of Washington has created Lyra, a point-and-click editor for creating D3 visualizations. We've used it and were impressed with it, so we wanted to share it with you as a learning resource or even a productivity tool. Keep in mind that Lyra is still experimental and requires some effort on your part to properly embed it in your work. The UW Data Lab has created some nice videos, tutorials, examples, a Wiki and a discussion group for your learning benefit.
One of the examples posted by the Data Lab is a classic data visualization piece, Napoleon's March.
Stephen McDaniel and Eileen McDaniel, Ph.D.
Topics: Data Analysis, Visual Analytics and Business Intelligence
This was originally published in the TDWI FlashPoint Newsletter in August of 2014
Italicized sections, images and their captions were not part of the TDWI version.
Until recently, visual analytics was considered a niche area. Those days are quickly passing; almost every major analytics and BI vendor is either launching or developing a product focused on visual analytics.
As a data professional, you’ll face challenges in integrating these products into your existing BI infrastructure. How can you successfully implement new visual analytics tools and keep your business customers engaged and happy? Step in front of the inevitable progression to visual analytics by crafting a winning data strategy.
We suggest starting in these three areas:
1. Learn the basics of the visual analytics tools used by the business analysts in your organization. Follow the process of how a real-world project is executed. Solving a typical business problem will give you a chance to experience firsthand what users are doing. You will be surprised at how the tools change your view of the data warehouse and “proper” data structures.
We’ve had many data professionals attend our analytics workshops. Even those with years of experience in the field tell us that managing code and databases is a completely different way of thinking about data compared to analyzing an issue, which has drastically different constraints and goals. Investigating a real problem that the business is facing should help you to see many possible ways that your data stores can be adjusted to enable successful analysis.
Check out our book on the principles of visual analytics, grounded in the scientific method, The Accidental Analyst. Stephen Few called it, "... a wonderful book, filled with practical advice."
Tools to consider include Qlik Sense, SAP Lumira, Tibco Spotfire, Tableau and Microstrategy Analytics Desktop. We have successfully used all of these tools in our work with various clients. They all have differing strengths, workflows and design philosophies. Read more about these products and others in our Candid Quadrants report.
2. Find an ally in each of your key business areas, preferably one who is an expert analyst, for a viewpoint from “the other side.” Leverage these analysts for invaluable knowledge to design better data structures in the form of tables, graphs, and system maps in your data systems. This is far more effective than decoding the whole process by yourself. When building data warehouses and downstream analytic data stores, we’ve discovered that expert analysts are often excited and motivated to collaborate on improving the efficiency and value of the data sources in their analyses.
3. Commit to the reality that self-service data management with desktop spreadsheets and databases among business users is not going away. Instead, it will only continue to accelerate over the next few years. Part of this reality is driven by the fact that the appropriate data structure is often dependent on the analysis problem at hand. Another reason driving this growth is that more data streams are flowing into organizations, often at a rate that is overwhelming for analysts and data teams alike.
In our experience, when we help business users improve their data management skills, they are less likely to make mistakes or inaccurate assumptions about the data. They also better comprehend and appreciate the hard work involved in maintaining central systems.
Seize the opportunity to be more successful in your career as a data professional by understanding and incorporating the new landscape of BI and visual analytics into your data warehouse and collaborating closely with business users to establish a strong environment for analytics. Ultimately, data warehouses are about making better decisions in a timely manner, and these suggestions can help you further the utility of your data warehouse.
Stephen McDaniel is Chief Data Scientist at Freakalytics, LLC and author of several books on analytic software. Eileen McDaniel, PhD, is author of The Accidental Analyst and Director of Analytic Communications at Freakalytics. Both work with clients on strategic analytic projects, teach courses on analytics and are on the faculty at INFORMS.
At the Qlik World 2014 expert data visualization panel, moderated by Donald Farmer, Donald asked each of us on the panel to offer up one of our favorite data visualizations for inspiration and learning.
Alberto Cairo, Visualization at The University of Miami, offered up John Snow's 1854 map of London that helped demonstrate that cholera was spread by contaminated water. Previously, many believed it was spread by noxious vapours or "bad air".
By meticulously canvassing the St James neighborhood and collecting high-quality data, John Snow was able to convince the local authorities to disable the well at the center of his map of cholera victims. The points are overlaid on a contemporary map of the neighborhood.
Kaiser Fung, Business Analytics and Data Visualization at NYU, cited a highly interactive work exploring the distribution of regional dialects in the United States, "How Y’all, Youse and You Guys Talk". After you answer a number of questions about how you would pronounce various words, this data visualization shows you the parts of the country that speak most similarly to you. We really like that it gives you previews of your regional dialect based on the last answer as you complete the numerous questions - great motivation to keep you going!
Stephen (of Freakalytics) cited Hans Rosling's TED talk about myths and realities of people in the "developing world". This talk is engrossing because he uses data visualization -and- his extensive work with this body of data to tell several surprising narratives that debunk many pre-conceived notions about the developing world (mistakenly characterized by many as the third world).
While people are impressed by the cool motion of the bubble charts and Hans's passion about the topic, the real message is that storytelling doesn't just happen by throwing up some charts. It's about knowing your data and the topic you are discussing, and bringing forth thoughtful analyses to inform and spur your audience towards better actions.
I was asked by Donald Farmer of Qlik about my favorite charts. Donald is leading a keynote panel on data visualization at Qlik's World Conference with Alberto Cairo, Kaiser Fung and me.
While I can't say that I have a favorite chart, I can definitely state that I often rely heavily on three chart types for much of my work with clients.
#1 A bar chart is likely the most versatile chart type. It can represent data by category, data over time and, my favorite, data as aligned bars (sometimes called trellised or latticed).
#2 Data over time is big in business, especially seeing how we are performing this year versus last. Here's a great way to easily see this year (bright purple) versus last year (light gray) in a line chart. We are doing much better this year, with only April and May showing low or no growth.
#3 Maps are critical to understanding business performance by location. I often scale the data against a benchmark or target and use a diverging color palette to find best performers and places that may need some help or guidance by management.
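For readers who want to try the idea outside a BI tool, here is a minimal sketch of scaling results against a target and mapping the ratio to a diverging palette. The state codes, sales figures, targets and the three anchor colors are all hypothetical choices for illustration, not values from my client work, and the linear blend is just one simple way to build such a palette.

```python
# Hypothetical sales and targets by state; illustrative numbers only.
SALES = {"WA": 120_000, "OR": 80_000, "CA": 310_000, "NV": 45_000}
TARGETS = {"WA": 100_000, "OR": 100_000, "CA": 300_000, "NV": 90_000}

# Assumed palette anchors: orange (below target), light gray (on target),
# blue (above target).
LOW = (230, 97, 1)
MID = (247, 247, 247)
HIGH = (33, 102, 172)


def percent_of_target(actual, target):
    """Scale a result against its benchmark; 1.0 means exactly on target."""
    return actual / target


def diverging_color(ratio, midpoint=1.0, spread=0.5):
    """Blend linearly from MID toward LOW or HIGH as ratio leaves midpoint.

    Ratios at or beyond midpoint +/- spread get the full end color.
    """
    # Position in [-1, 1]: negative is below the midpoint, positive above.
    t = max(-1.0, min(1.0, (ratio - midpoint) / spread))
    end = HIGH if t >= 0 else LOW
    weight = abs(t)
    rgb = tuple(round(m + (e - m) * weight) for m, e in zip(MID, end))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


colors = {
    state: diverging_color(percent_of_target(SALES[state], TARGETS[state]))
    for state in SALES
}

for state, color in colors.items():
    print(state, round(percent_of_target(SALES[state], TARGETS[state]), 2), color)
```

Centering the palette on the target (rather than on zero or on the data's average) is what makes the neutral color mean "on plan", so the strongest colors point straight at the best performers and the places that may need attention.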
Using data from the first four seasons of the Shark Tank, Freakalytics has assembled a few fascinating insights for fans and potential entrepreneurs that may come before the Sharks in future seasons.
While Barbara Corcoran is the most frequent investor, Mark Cuban is the investor with the largest amount invested and Mr. Wonderful invests the most, on average.
Lori Greiner paid the highest amount relative to the valuation proposed by the entrepreneur. Note that a significant number of the investments were made at 1 times the entrepreneur's valuation, but the vast majority of investments were made below the entrepreneur's asking valuation.
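If you would like to reproduce this kind of comparison, the sketch below shows one plausible way to compute a "valuation vs ask" ratio and the per-investor summaries discussed above. It assumes the implied valuation is simply the dollar amount divided by the equity share, which may differ from the exact assumptions in my analysis. The investor names and deal figures are hypothetical placeholders, not data from the show.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical deals, NOT data from the show:
# (investor, ask amount, ask equity, deal amount, deal equity)
# Equity is expressed as a fraction between 0 and 1.
DEALS = [
    ("Shark A", 100_000, 0.10, 100_000, 0.20),
    ("Shark A", 50_000, 0.10, 50_000, 0.10),
    ("Shark B", 200_000, 0.20, 300_000, 0.40),
]


def implied_valuation(amount, equity):
    """Company value implied by paying `amount` for an `equity` share."""
    return amount / equity


def valuation_vs_ask(ask_amount, ask_equity, deal_amount, deal_equity):
    """Ratio of the investor's implied valuation to the entrepreneur's."""
    return (implied_valuation(deal_amount, deal_equity)
            / implied_valuation(ask_amount, ask_equity))


def summarize(deals):
    """Per investor: deal count, total and average invested, average ratio."""
    rows = defaultdict(list)
    for investor, ask_amt, ask_eq, deal_amt, deal_eq in deals:
        ratio = valuation_vs_ask(ask_amt, ask_eq, deal_amt, deal_eq)
        rows[investor].append((deal_amt, ratio))
    return {
        investor: {
            "deals": len(pairs),
            "total_invested": sum(amount for amount, _ in pairs),
            "avg_investment": mean([amount for amount, _ in pairs]),
            "avg_valuation_vs_ask": mean([ratio for _, ratio in pairs]),
        }
        for investor, pairs in rows.items()
    }


summary = summarize(DEALS)
for investor, stats in summary.items():
    print(investor, stats)
```

A ratio of 1 means the Shark accepted the entrepreneur's valuation as asked, a ratio below 1 means the Shark negotiated a lower valuation, and a ratio above 1 means the Shark paid a premium.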
Just three of the Sharks appeared in all four seasons: Mr. Wonderful, Daymond John and Robert Herjavec. Of these three, Daymond invested the most total dollars.
Notice how showing the same information from the previous chart, a tree map, as a line chart offers different insights at first glance. Investor frequency of appearance and trends by season are now much clearer.
Here, several charts are brought together as an interactive, analytic dashboard. Notice the refinements to the scatter plot: reference lines for average investment size and average valuation vs ask.
Barbara is among the most conservative investors on the show, paying just 49% of entrepreneur valuation and outlaying an average investment of just $92k.
Lori is the most aggressive investor, with an average 87% valuation vs entrepreneur ask. She often invests in products ready for the market now and needing a rapid go to market plan.
Mark Cuban is the last Shark selected in the analytic dashboard. I will let you examine this one for your own insights.
A few closing thoughts from my analysis
+ Of active investors, Barbara & Daymond pay least on company asking value, 49% & 53%
+ Lori willing to pay the most versus company ask, 87%
- Perhaps due to her quick payback horizon as a TV/marketing expert
- Paid highest ratio ever, 300%
+ Mark Cuban invested largest amount, $1.25M
- Strikes solid bargain at 63% of asking value
+ Mr. Wonderful has largest average investment size at $175k
- Similar to Mark, 57% of company asking value
+ Robert is most conservative, $90k/investment
- 65% of company asking valuation
Disclaimer: there are potential data entry problems in this data, and my analysis assumptions vary slightly from those of other Shark Tank analyses I found online. However, I believe the key trends and insights are substantially similar to those in the other analyses. Royalty and loan schemes were ignored in this analysis.