Thanks for taking the time to read my thoughts about Visual Business
Intelligence. This blog provides me (and others on occasion) with a venue for ideas and opinions
that are either too urgent to wait for a full-blown article or too
limited in length, scope, or development to require the larger venue.
For a selection of articles, white papers, and books, please visit
October 28th, 2015
During the last two days, I spent a great deal of time corresponding with my friend Alberto Cairo after he informed me that he was hosting a public lecture by David McCandless at the University of Miami. Alberto and I are both critical of McCandless’ infographics. I am more passionate in my criticism, however, perhaps because I frequently and directly encounter the ill effects of McCandless’ influence. More than anyone else working in data visualization today, McCandless has influenced people to design data visualizations in ways that are eye-catching but difficult to read and often inaccurate. Also more than anyone else, when my readers and students talk about the challenges that they face in the workplace because their bosses and clients expect eye-candy rather than useful information effectively displayed, they identify McCandless as the source of this problem.
You can imagine my dismay when Alberto told me about the lecture. I argued that he shouldn’t provide McCandless with a forum for promoting his work unless he also provided a critique of that work during the event. Alberto’s position was that, as an academic and a journalist, he should provide a platform for anyone whose work in the field of data visualization is known, regardless of quality or the harm that it does. Further, he argued that his students and those who have read his books already know that he finds much of McCandless’ work lacking. My response to this was, “What about those who attend the event but are not your students or readers?” After the discussion, I found myself wanting to ask one more question: “What do I say to someone who tells me that his boss attended the lecture, and this exposure to McCandless’ work set his efforts to promote effective practices back by several years?” Even worse, what if he also says, “Steve, I encouraged my boss to attend the lecture because it was hosted by Alberto Cairo, whose work you’ve praised.”
To no avail, I pleaded with Alberto to provide a counterpoint to the presentation to make it clear to attendees that McCandless often promotes practices that are ineffective. I argued that without providing this counterpoint, he was abdicating his responsibility as a teacher and a journalist. He saw it differently. He replied that his indirect approach to combating ineffective practices is perhaps more effective than my direct approach.
Is Alberto right? Was it appropriate for him to host a public lecture by McCandless without offering a counterpoint? Should I become less direct in my criticism of harmful practices? Will they cease to plague our work faster if I do? What does your experience tell you?
October 19th, 2015
You have likely heard that experts are notoriously bad at forecasting. This insight comes from the work of Philip Tetlock, who has written a brand new book titled Superforecasting: The Art and Science of Prediction.
There are approximately 20,000 intelligence analysts working in the United States today for various agencies, “assessing everything from minute puzzles to major events such as the likelihood of an Israeli sneak attack on Iranian nuclear facilities or the departure of Greece from the Eurozone.” When intelligence agencies get things wrong (e.g., weapons of mass destruction in Iraq) or fail to predict a tragic event (e.g., 9/11), they get beat up in the media and by politicians. Forecasting these events is extremely complex and can never be done with certainty. Rather than slinking away and covering their asses, beginning in 2011 the Intelligence Advanced Research Projects Activity (IARPA), the DARPA of the intelligence community, created a tournament in an attempt to learn how their forecasts could be improved. In the beginning, five teams competed by generating forecasts in response to the types of tough questions that intelligence analysts routinely face. One of those five teams was the Good Judgment Project (GJP), organized by Tetlock and his research (and life) partner Barbara Mellers. After two years, the GJP team was doing so much better than the others that IARPA dropped the rest and focused entirely on what it could learn from the GJP team. Over four years these forecasters responded to nearly 500 questions about world affairs, with predictive timelines extending from just over a month to a little less than a year. During the course of this tournament, Tetlock and Mellers were able to hone the team into a select group that they called “superforecasters.” The book Superforecasting presents their findings, and they’re extraordinary.
To determine the best forecasts, predictions had to be made in clear and unambiguous terms that could be later compared to actual events. Also, the effectiveness of forecasters had to be measured through hundreds of forecasts to confidently identify the best. Once identified, these superforecasters were carefully observed to find out what separated them from the others. Here, in a nutshell, is a summary of Tetlock’s and Mellers’ findings:
Superforecasting does require minimum levels of intelligence, numeracy, and knowledge of the world, but anyone who reads serious books about psychological research [such as Tetlock’s] probably has those prerequisites. So what is it that elevates forecasting to superforecasting? …What matters most is how the forecaster thinks… Broadly speaking, superforecasting demands thinking that is open-minded, careful, curious, and—above all—self-critical. It also demands focus. The kind of thinking that produces superior judgment does not come effortlessly. Only the determined can deliver it reasonably consistently, which is why our analyses have consistently found commitment to self-improvement to be the strongest predictor of performance. (p. 20)
Do not be satisfied with this summary. Many of the qualities that produce effective forecasts can be developed. If you’re reading this blog, chances are they’re within your reach. Read this fascinating book for yourself to discover and understand what Tetlock and Mellers found.
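The measurement Tetlock describes — comparing probabilistic forecasts against what actually happened, over many questions — is done with the Brier score, the metric the Good Judgment Project used: the mean squared difference between each probability forecast and the outcome. A quick sketch in JavaScript (the example forecasts are made up for illustration):

```javascript
// Brier score: mean squared difference between the forecast
// probability and the actual outcome (1 = happened, 0 = didn't).
// 0 is a perfect score; higher is worse.
function brierScore(forecasts) {
  const sum = forecasts.reduce(
    (s, f) => s + (f.p - f.outcome) ** 2, 0);
  return sum / forecasts.length;
}

// A forecaster who said 80% and 30% for two events that did
// and didn't happen, respectively:
const score = brierScore([
  { p: 0.8, outcome: 1 },   // event occurred
  { p: 0.3, outcome: 0 },   // event did not occur
]);
// (0.2^2 + 0.3^2) / 2 ≈ 0.065
```

Because the score rewards well-calibrated probabilities rather than bold pronouncements, it is only meaningful when forecasts are stated in the unambiguous, numeric terms Tetlock insisted on.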
October 15th, 2015
After the completion of my 2012 Dashboard Design Competition, I created my own version of the Student Performance Dashboard based on the same data that the competitors used. Since then, a few individuals and software vendors have asked for a copy of the data so they could reproduce my version of the dashboard using their dashboard-creation tool of choice. Recently, I received such a request from an application developer named Robert Monfera. He wanted to create a functional version of the dashboard using d3, a programming tool for creating rich graphs for the Web. In recent years, d3, which Michael Bostock of the New York Times created while a doctoral student at Stanford, has become the preferred tool among graphics designers and developers for creating infographics and web-based analytical applications when a commercial data visualization product won’t do. Robert wanted to create the Student Performance Dashboard using d3 as a learning exercise. As a reminder, here’s a small section of the dashboard:
I was happy to give Robert the data along with permission to recreate my design, but this evolved into enthusiasm when Robert began to show me what he could do with d3. I quickly realized that he was not your everyday software developer. I invited him to add some sorting functionality that I wanted to demonstrate with a working example of the dashboard and promised to showcase his work when it was ready—and now it is.
When I introduced my version of the Student Performance Dashboard, which appears in the second edition of my book Information Dashboard Design, I acknowledged that the ability to sort the rows of student information in various ways could be useful. However, I suggested that this should be implemented in a way that automatically reverts to the original sort order immediately after viewing the data. This is because dashboards will only work for rapid performance monitoring if they present the data in the same manner from day to day, without alteration. Otherwise, people will never learn to use them rapidly because of the disorientation that is caused when anything other than the data changes, including the way in which items are sorted. Ideally, I wanted the interface to allow the viewer to click on a column of data, which would cause the rows to sort based on the variable contained in the column for as long as the mouse button was held down and then revert to the original order as soon as the button was released. Robert was able to implement this functionality as I envisioned, causing the rows to visibly reorder, taking just enough time for the viewer to notice whether only a few or many rows needed to be rearranged to exhibit the new order. It works beautifully, which you can see for yourself by interacting with the dashboard.
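The key to this press-and-hold behavior is that the sort must be non-destructive: the original order is kept intact so the rows can snap back the moment the button is released. A minimal sketch of that logic in plain JavaScript — the function and field names here are illustrative, not Robert’s actual code:

```javascript
// Press-and-hold sorting: the original row order is never mutated,
// so reverting on mouseup is simply a matter of re-rendering it.
// (Illustrative sketch only; the data fields are hypothetical.)
const originalRows = [
  { name: "Avery", grade: 91 },
  { name: "Blake", grade: 78 },
  { name: "Casey", grade: 85 },
];

// Return a NEW array sorted by the given column, descending,
// leaving originalRows untouched.
function sortedBy(rows, key) {
  return rows.slice().sort((a, b) => b[key] - a[key]);
}

// On mousedown over a column heading: render the sorted view.
const held = sortedBy(originalRows, "grade");

// On mouseup: render originalRows again -- there is no state to
// undo, because the sort never altered it.
```

In d3 terms, rendering `held` versus `originalRows` with a brief transition produces the visible reordering described above, with the animation making it obvious how many rows had to move.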
Even though I wouldn’t ordinarily include brushing and filtering functionality in a performance monitoring dashboard, Robert wanted to see its effects for himself, so he added this functionality as well. When you select one or more rows in the dashboard, the summary graphs at the bottom of each row are filtered to reflect the selected students only. To select multiple rows, simply click and drag across the entire set. You can select non-contiguous rows by holding down the Ctrl key and either clicking individual rows or dragging down multiple rows.
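The selection semantics just described — a drag selects a contiguous range, while Ctrl-click toggles individual rows into a non-contiguous set — can be sketched as follows; this is an illustration of the idea, not Robert’s implementation:

```javascript
// Brushing selection: a drag selects a contiguous range of rows;
// Ctrl-click toggles single rows into a non-contiguous selection.
// Summary graphs are then recomputed over the selected rows only.
// (Illustrative sketch; field names are hypothetical.)
const selected = new Set();

function dragSelect(from, to) {          // contiguous drag
  selected.clear();
  for (let i = from; i <= to; i++) selected.add(i);
}

function ctrlClick(i) {                  // toggle one row
  selected.has(i) ? selected.delete(i) : selected.add(i);
}

// Recompute a summary (here, a mean grade) over the selection;
// with nothing selected, summarize all rows.
function summaryOf(rows) {
  const picked = rows.filter(
    (_, i) => selected.size === 0 || selected.has(i));
  return picked.reduce((s, r) => s + r.grade, 0) / picked.length;
}

const rows = [{ grade: 80 }, { grade: 90 }, { grade: 70 }];
dragSelect(0, 1);                        // drag across rows 0..1
const meanSelected = summaryOf(rows);    // mean of 80 and 90
```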
I should let Robert speak about his work for himself. Here’s what he wrote to me:
D3.js is a leading tool for data visualization development I’ve used for my customers’ options trade charting, machine learning visualizations and an environmental dashboard. Mike Bostock, who created d3.js, and others provided a wealth of bite-sized examples, including your bullet graph.
Your books and articles on dashboard design match my experience that people are interested in context and detail behind the focus, rather than regressing to sparse presentations for their apparent simplicity and appeal. Many of your articles deal with simplicity vs. complexity, and your teachings lead to complex, information-dense, yet intelligible, tailored information graphics, also rooted in Tufte’s findings, such as this:
If the visual task is contrast, comparison and choice – as so often it is – then the more relevant information within eyespan, the better. Vacant, low-density displays, the dreaded posterization of data spread over pages and pages, require viewers to rely on visual memory – a weak skill…
My learning goals called for an implementation of a data-rich dashboard with d3.js that can be the basis for further, shared experimentation. The best publicly accessible dashboard design resource I know of, your 2012 Dashboard Design Competition, even has multiple realizations, including your take on the challenge, which includes some of your research, e.g. on color and bandlines, which, similar to your bullet graphs, are poised to become widespread. Also, I feel deeply about education via visualization using, as Bret Victor calls it, Magic Ink as an underexplored medium to help people understand and learn.
The experiment involved complex, deeply nested visual elements, following the simplest structures and lightest abstractions – testing d3.js on an exacting, detailed, bespoke design, without involving other structuring tools I’d use for clients. While the program has specific shortcomings, the upshot is that problems and patterns have emerged that wouldn’t have come up with a much smaller task.
A couple of interactions were also added – no claims about their utility – even though a dashboard, as you define it, is not an exploratory data analysis tool. Some of them look interesting: for example, clicking on a column heading and seeing how many (or how few) lines move around, and by how much, is viscerally revealing of correlations with grade score. Also, the aggregate bandline seems informative and engaging in its transitions as the set of rows changes during brushing.
Planned work involves the factoring out of the bandline function for easy reuse by everyone; improving on the data binding abstraction (also in light of d3 v4), proper code structuring and subsequent open sourcing. Interactions on mobile devices are not yet enabled. Fixed coordinates should be replaced by configuration or automatic layout optimization.
Steve, I’m grateful for the permission you gave me for implementing your design and using your data, and our discussions about design. It’s been an educational journey and starting point for further exploration.
To explore Robert’s work on your own, simply click the dashboard image below to access the working version. To contact Robert Monfera, you may reach him via email at monfera.robert at gmail.com.
September 30th, 2015
In my recent newsletter article titled “A Course of Study in Analytical Thinking,” I included “scientific thinking” as a specific type of thinking that we should understand and practice as data sensemakers. For this particular topic, I recommended the book A Beginner’s Guide to Scientific Method, Fourth Edition, by Stephen S. Carey as a useful introduction, but admitted that I had not yet read the book. I had read others on the topic that didn’t suit the need, and Carey’s book seemed to be the best bet based on the author’s description and the comments of several satisfied readers. Within a day or two of the article’s publication, my copy of the book finally arrived, and I’m relieved to say that it’s a perfect fit.
It’s a short book of only 146 pages (including the index), but it covers the topic beautifully. It even includes quizzes and exercises for the dedicated learner. I especially appreciate its thoughtful focus on the essence of science and scientific method, never venturing into territory that non-scientists would find esoteric or intimidating. If you’re like me, you probably assumed that there were many good books of this type available, but this is surprisingly not the case. Given the importance of science and the fact that everyone should understand what it is and how it is essentially performed, this is a tragic void. Thankfully, Carey must have recognized this two decades ago when he wrote the first edition and has continued to serve the ongoing need by updating it every few years with current examples.
Carey breaks the content into six chapters:
- Science (This chapter defines science, describes the methods that are common across all branches of science, and argues for its importance.)
- Observation (This chapter describes the process of effective observation.)
- Explanation (This chapter focuses on the goal of explaining “why things happen as they do in the natural world,” including the special role of hypotheses and theories.)
- Experimentation (This chapter describes the role of experimentation, various types of experiments, and the ways experiments should be designed and conducted to produce reliable findings.)
- Establishing Causal Links (This chapter extends the topic of experimentation by addressing the special techniques, including statistics, that must be used to establish causation.)
- Fallacies in the Name of Science (This chapter draws a clear distinction between science and pseudo-science, including basic tests for distinguishing science from its imitation.)
Unless you’re already trained in the ways of science, you’ll find this book enlightening and enjoyable. It’s quite possible that you’ve already published a research paper in your field of study but somehow never learned what this little book teaches. I’ve read many research papers, especially in my field of information visualization, which had the appearance of science, with technical jargon and lots of statistics (often misapplied), but were in fact pseudo-science because the researchers and their professors did not understand the basic methods of science. So many time-consuming but ultimately worthless projects might have been salvaged had the researchers read this simple little book.
P.S. When I wrote this blog post, I’d forgotten how horribly expensive this book is. It lists for almost $100. Even discounted, it will still cost you nearly $80. This is unconscionable. I doubt that it was the author’s decision to price it out of reach. I suspect that this is an example of Wadsworth Publishing’s shortsightedness. They see it as a textbook that only students will purchase – students who will have no choice in the matter. In fact, this book would have a broad audience if it were reasonably priced; so much so that the publisher and author would earn a great deal more money. What a shame! Until this changes, try to find yourself a used copy.
August 31st, 2015
It annoys me when I see poor journalistic infographics, in part, because I value journalism and I hate to see it done ineffectively. Good news organizations take the quality of their journalists’ writing seriously. Journalists and their editors work hard to produce news stories that are accurate, clear, and compelling. Those who can’t write effectively lose their jobs. So, why is it that some of the same publications that take great pains to produce well-written articles don’t bat an eye when they produce infographics that are inaccurate or unnecessarily difficult to understand?
Take the following infographic recently published by Time as an example. The topic is important, “Why We Still Need Women’s Equality Day,” but notice how unnecessarily hard you must work to get the information and how difficult it is to compare the values that it displays and combine them into a sense of the whole.
This infographic provides eight measures of women’s participation in government. Each measure is expressed as a percentage of female vs. male participation. So why is each measure presented graphically in a different way? A single graphical form that makes the percentages of female vs. male participation easy to read and understand for all of the eight measures would work so much better. Also, given the fact that there is value in comparing the eight measures, why does the infographic arrange them vertically in a way that no computer or tablet screen could contain without scrolling? And even if all eight measures could fit on a single screen, because every one is expressed in a different manner, they still couldn’t be quickly and easily compared.
Has anything been gained by displaying the eight measures in these various ways? Some infographic designers would argue that by displaying the measures differently, visual interest has been increased, resulting in greater reader engagement. I suppose that there are people who might actually find this variety of expression engaging, but only in a way that draws them into the pictures independent of their meanings. Is someone who reads this article merely to enjoy the pictures with little concern for the content and its meaning an appropriate audience? Only if the journalist is trying to win a popularity contest among the disinterested.
Here’s the same story told primarily in graphical form, but this time it is clear, simple to read, makes comparisons easy, and brings the measures together in a way that makes the whole available at a glance.
A great deal has been gained through this redesign, but has anything been lost? Nothing other than meaningless, complicating, and distracting variety.
Isn’t it time that we demand of graphical journalists the same standards of effectiveness that we demand of their traditional counterparts? Journalism is journalism. Whether the story is told in words, pictures, or both should be determined by the nature of the information, and the integrity of that information should always be respected.