Thanks for taking the time to read my thoughts about Visual Business Intelligence. This blog provides me (and others on occasion) with a venue for ideas and opinions that are either too urgent to wait for a full-blown article or too limited in length, scope, or development to require the larger venue. For a selection of articles, white papers, and books, please visit my library.

 

Drowning in the Shallows

July 1st, 2016

Expertise is developed through “deep work.” This term was coined by Cal Newport to describe the highly focused periods of concentration that are required, not just to develop expertise, but to do good work in almost any field of endeavor. He defines deep work as:

Professional activities performed in a state of distraction-free concentration to push cognitive capabilities to their limit. These efforts create new value, improve your skill, and are hard to replicate.

Achievement in almost all fields of endeavor, and especially in all forms of knowledge work, demands deep work. Most knowledge workers today spend their time drowning in the shallows. Newport defines “shallow work” as:

Noncognitively demanding, logistical-style tasks, often performed while distracted. These efforts tend to not create much new value in the world and are easy to replicate.

Newport, an assistant professor of computer science at Georgetown University and the author of the blog Study Hacks as well as the book So Good They Can’t Ignore You, has written a new book titled Deep Work: Rules for Focused Success in a Distracted World. Reading Deep Work this week was a perfect continuation and expansion of the thoughts that I expressed in my June 13th blog post, “Data Visualization Lite,” for a lack of deep work is probably the reason why most recent work in the field of data visualization is splashing around in the shallows.

Deep Work by Cal Newport

This book is based on the following hypothesis about deep work:

The ability to perform deep work is becoming increasingly rare at exactly the same time it is becoming increasingly valuable in our economy. As a consequence, the few who cultivate this skill, and then make it the core of their working life, will thrive.

I believe that Newport’s hypothesis is valid. Over the years I have written and spoken many times about the importance of slowing down and thinking deeply, over an extended period of time, as the path to understanding and also to a fulfilling professional life. I work hard to create space for regular deep work. It is for this reason that I have always avoided all forms of social media (Facebook, LinkedIn, Twitter, etc.), for they offer me little compared to the overwhelming distraction that they would create. During much of my professional life I struggled to carve out opportunities for deep work in organizations that were not designed to support it. Founding Perceptual Edge made it possible for me to design a workplace and schedule that support the deep work that makes me happy and productive.

In this book, Newport explains what deep work involves, why it’s important, what makes it so difficult to experience among today’s knowledge workers, and how these difficulties can be overcome. Newport’s insight that deep work is needed is not new, but what he’s done with this book is. He has exposed the problem and prescribed its remedy in a way that perfectly fits our current, technologically “connected” world. If you struggle to reach your cognitive potential as your mind flits from shallow thought to shallow thought, frenetically busy but not productive, I recommend that you read Deep Work.

Take care,


To Err Is Human

June 28th, 2016

My revision of Alexander Pope’s words, “To err is human, to forgive, divine,” is not meant to diminish the importance of forgiveness, but instead to promote the great value of errors as learning opportunities. We don’t like to admit our mistakes, but it’s important that we do. We all make errors in droves. Failing to admit and learn from private errors may harm no one but ourselves, but this failure has a greater cost when our errors affect others. Acknowledging public errors, such as errors in published work, is especially important.

I was prompted to write this by a recent email exchange. I heard from a reader named Phil who questioned a graph that appeared in an early printing of my book Information Dashboard Design (First Edition). This particular graph was part of a sales dashboard that I designed to illustrate best practices. It was a horizontal bar graph with two scales and two corresponding series of bars, one for sales revenues and one for the number of units sold. It was designed in a way that inadvertently encouraged a potentially misleading comparison of revenues and unit counts (see below).

Original

I would not design a graph in this manner today, but when I originally wrote Information Dashboard Design in 2005, I had not yet thought this through. This particular graph was further complicated by the fact that the scale for units was expressed in 100s (e.g., a value of 50 on the scale represented 5,000), which was a bit awkward to interpret. I fixed the dual-scale and units problem in the book long ago (see below).

Current

I began my response to Phil’s email with the tongue-in-cheek sentence, “Thanks for reminding me of past mistakes.” I had forgotten about the earlier version of the sales dashboard, and Phil’s reminder made me cringe. Nevertheless, I admitted my error to him and now I’m admitting it to you. I learned from this error long ago, which removes most of this admission’s sting. Even had the error persisted to this day, however, I would still have acknowledged it, despite the discomfort, because that’s my responsibility to my readers, and to myself as well.

When, in the course of my work in data visualization, I point out errors in the work of others, I’m not trying to discourage them. Rather, I’m hoping first to counter the ill effects of those errors on the public and second to give those responsible for the errors an opportunity to learn and improve. This is certainly the case when I critique infovis research papers. I want infovis research to improve, which won’t happen if poor papers continue to be published without correction. This was also the case when I recently expressed my concern that most of the books written about data visualization practices in the last decade qualify as “Data Visualization Lite.” I want a new generation of data visualization authors and teachers to carry this work that I care about forward long after my involvement has ceased. I want them to stand on my shoulders, not dangle precariously from my belt.

Imagine how useful it would be for researchers to publish follow-ups to their papers a few years after publication. Researchers could correct errors and describe what they’ve learned since publication. They could warn readers to dismiss claims that have since been shown to be invalid. They could describe how they would redesign the study if they were doing it again. This could contribute tremendously to our collective knowledge. How often, however, do authors of research papers ever mention their own previous work, except briefly in passing? What if researchers were required to maintain an online document, linked to each of their published papers, to record all subsequent findings affecting the content of the original paper? As it is now, bad research papers never die. Most are soon forgotten, assuming they were ever noticed in the first place, but they’re often kept alive for many years through citations, even when they’ve been deemed unreliable.

A similar practice could be followed by authors of books. Authors sometimes do this to some degree when they write a new edition of a book. Two of my books are now in their second editions. Most of the changes in my new editions involve additional content and updated examples, but I’ve corrected a few errors as well. Perhaps I should have included margin notes in my second editions to point out content that was changed since the first edition to correct errors. This might be distracting for most readers, however, especially those who hadn’t read the previous edition, but I could provide a separate document on my website listing those corrections for anyone who cares. Perhaps I will in the future.

Errors are our friends if we develop a healthy relationship with them. This relationship begins with acceptance, continues through correction, and lives on in the form of better understanding. Those who encourage this healthy relationship by opening their work to critique and by critiquing the work of others are likewise our friends. If I’ve pointed out errors in your work, I’m not your enemy. If you persist in spreading errors to the world despite correction, however, you become an enemy to your readers.

Data visualization matters. It isn’t just a job or a field of study; it’s a path to understanding, and understanding is our bridge to a better world.

Take care,


We Must Vet Our Data Visualization Advisers with Care

June 24th, 2016

When we need advice in our personal lives, to whom do we turn? To someone we trust, who has our interests at heart and is wise. So why then do we often rely on advisers in our professional lives whose interests are in conflict with our own? If your work involves business intelligence, analytics, data visualization, or the like, from whom do you seek advice about products and services? If you’re like most professionals, you unwittingly seek advice from people and organizations with incentives to sell you something. You get advice from the vendors themselves, from technology analysts with close ties to those vendors, or from journalists who are secretly compensated by those vendors. That’s not rational, so why do we do it? Usually because it’s convenient and sometimes because we don’t really care if the advice is good or not, for it is our employers, not us, who will suffer the consequences. If we actually care, however, we should do a better job of vetting our advisers.

It should be obvious that we cannot expect objectivity from the vendors themselves. Even when a vendor’s employees post advice on independent websites and claim that their opinions are their own, they remain loyal to their employers. In fact, it’s a great marketing ploy for vendors to have their employees post advice on independent sites rather than on their own. It suggests a level of objectivity that serves the vendor’s interests and multiplies their presence on the web. We must also question with similar suspicion the objectivity of consultants and teachers who have built their work around a single product.

What about technology analyst groups, such as Gartner, Forrester, and TDWI, to name a few of the big guys? These organizations fail in many ways to maintain a healthy distance from the very technology vendors that are the subject of their advice. In fact, they are downright cozy with the vendors.

Trustworthy technology advisers take great pains to maintain objectivity. They are few and far between. To be objective, I believe that advisers should do the following:

  • Disclose all of their relationships with vendors. This is especially true of relationships that involve the exchange of money. If they accept money from vendors, they should willingly disclose the figures upon request.
  • Do not allow vendors to advertise on their websites, in their publications, or at their events.
  • Only accept payments from vendors for professional services specifically rendered to improve the vendor’s products or services. Payments for marketing advice do not qualify.
  • Do not publish content prepared by vendors.

Try to find technology analysts and journalists who follow these guidelines. Even with diligent effort, you won’t find many, because there aren’t many to find.

Try an experiment. If your company subscribes to one of the big technology analyst services (Gartner, etc.), next time they produce a report that scores BI, analytics, or data visualization products, ask them for a copy of the data on which they based those scores, along with the algorithms that processed the data. This is likely done in an Excel spreadsheet, so just ask them for a copy of the file. After making the request, watch them squirm and expect creative excuses. Most likely they’ll say something along these lines: “Our scoring system is based on a sophisticated and proprietary algorithm that we cannot make public because it gives us an edge over the competition.” Bullshit. There is definitely a secret in that spreadsheet that they don’t want to share, but it is not a sophisticated algorithm.

After they refuse to show their work, move on to the following request: “Please give me a list of the vendors that you evaluated along with the amount of money that you have received from each for the last few years.” They won’t give it to you, of course, and they’ll explain that they cannot for reasons of confidentiality. Think about that for a moment. It is no doubt true that they promised to never reveal the money that changed hands between them and the vendors, but shouldn’t this clear conflict of interest be subject to scrutiny? Technology analysts and the vendors that they support are not fans of transparency.

There are a few technology advisers who do good work and do it with integrity. If you want objective and expert advice from someone who is looking out for your interests, be sure to vet your advisers with diligence and care. Question their motives. If it looks like they’re acting as an extension of vendor marketing efforts, they probably are. If, on the other hand, you’re just looking for easy answers, abandon all skepticism and do a quick Google search and then read the advice that receives top ranking. Or, better yet, schedule a call with the analyst group for whose advice you pay dearly in the form of an annual subscription.

Take care,


(Postscript: Yes, I consider myself one of the few data visualization advisers whom you can trust.)

Data Visualization Lite

June 13th, 2016

In the world of data visualization, we are progressing at a snail’s pace. This is not the encouraging message that vendors and many “experts” are promoting, but it’s true. In the year 2004, I wrote the first edition of Show Me the Numbers in response to a clear and pressing need. At the time no book existed that pulled together the principles and best practices of quantitative data presentation and made them accessible to the masses of mostly self-trained people who work with numbers. I was originally inspired by the work of Edward Tufte, but realized that his work, exceptional though it was, awed us with a vision of what could be done without actually showing us how to do it. After studying all of the data visualization resources that I could find at the time, I pulled together the best of each, combined it with my own experience, gave it a simple and logical structure, and expressed it comprehensibly in accessible and practical terms. At that time, data visualization was not the hot topic that it is today. Since then, as the topic has ignited the imagination of people in the workplace and become a dominant force on the web, several books have been written about quantitative data presentation. I find it disappointing, however, that almost nothing new has been offered. With few exceptions, most of the books that have been written about data visualization, excluding books about particular tools or specific applications (e.g., dashboard design), qualify as data visualization lite.

Those books written since 2004 that aren’t filled with errors and poor guidance, with few exceptions, merely repeat what has been written previously. Saying the same old thing in a new voice is not helpful unless that new voice reaches an audience that hasn’t already been addressed or expresses the content in a way that is more informative. Most of the new voices are addressing data visualization superficially, appealing to an audience that desires skill without effort. As such, they dangle a false promise before the eager eyes of lazy readers. Data visualization lite is not a viable solution to the world’s need for clear and accurate information. Instead, it is a compromise tailored to appeal to short attention spans and a desire for immediate expertise, which isn’t expertise at all.

In a world that longs for self-service business intelligence, naively placing data sensemaking and communication in the same category as pumping gas, we need fresh voices to proclaim the unpopular truth that these skills can only be learned through thoughtful training and prolonged practice. It is indeed true that many people in our organizations can learn to analyze and present quantitative data effectively, but not without great effort. We don’t need voices to reflect the spirit of our time; we need voices to challenge that spirit—voices of transformation. Demand depth. Demand lessons born of true expertise. Demand evidence.

Where are these fresh and courageous voices? Who will light the way forward? There are only a few who are expressing new content, addressing new audiences, or expressing old content in new and useful ways. Until we demand more thoughtful and transformative work, the future of data visualization will be dim.

Take care,


Avoiding Quantitative Scales That Make Graphs Hard to Read

May 24th, 2016

This blog entry was written by Nick Desbarats of Perceptual Edge.

Every so often I come across a graph with a quantitative scale that is confusing or unnecessarily difficult to use when decoding values. Consider the graph below from a popular currency exchange website:

Example of poorly chosen quantitative scale
Source: www.xe.com

Let’s say that you were interested in knowing the actual numerical value of the most recent (i.e., right-most) point on the line in this graph. Well, let’s see, it’s a little less than halfway between 1.25 and 1.40, so a little less than half of… 0.15, so about… 0.06, plus 1.25 is… 1.31. That feels like more mental work than one should have to perform to simply “eyeball” the numerical value of a point on a line, and it most certainly is. The issue here is that the algorithm used by the graph rendering software generated stops for the quantitative scale (0.95, 1.10, 1.25, etc.) that made perceiving values in the graph harder than it should be. This is frustrating since writing an algorithm that generates good quantitative scales is actually relatively straightforward. I had to develop such an algorithm in a previous role as a software developer and derived a few simple constraints that consistently yielded nice, cognitively fluent linear scales, which I’ve listed below:

1. All intervals on the scale should be equal.

Each interval (the quantitative “distance” between value labels along the scale) should be the same. If the intervals aren’t equal, it’s more difficult to accurately perceive values in the graph, since we have to gauge portions of different quantitative ranges depending on which part of the graph we’re looking at (see example below).

Unequal intervals
Source: www.MyXcelsius.com

2. The scale interval should be a power of 10 or a power of 10 multiplied by 2 or 5.

Powers of 10 include 10 itself, 10 multiplied by itself any number of times (10 × 10 = 100, 10 × 10 × 10 = 1,000, etc.), and 10 divided by itself any number of times (10 ÷ 10 = 1, 10 ÷ 10 ÷ 10 = 0.1, 10 ÷ 10 ÷ 10 ÷ 10 = 0.01, etc.). We find it easy to think in powers of 10 because our system of numbers is based on 10. We also find it easy to think in powers of 10 multiplied by 2 or 5, the two numbers other than itself and 1 by which 10 can be divided to produce a whole number (i.e., 10 ÷ 2 = 5 and 10 ÷ 5 = 2). Here are a few examples of intervals that can be produced in this manner:

Sample Powers of 10

Here are a few examples of good scales:

Good Scales

Here are a few examples of bad scales:

Bad Scales

After this post was originally published, astute readers pointed out that there are some types of measures for which the “power of 10 multiplied by 1, 2, or 5” constraint wouldn’t be appropriate, specifically, measures that the graph’s audience thinks of as occurring in groups of something other than 10. Such measures include months (3 or 12), seconds (60), RAM in gigabytes (4 or 16), and ounces (16). For example, a scale for months of 0, 5, 10, 15, 20 would be less cognitively fluent than 0, 3, 6, 9, 12, 15, 18 because virtually everyone is used to thinking of months as occurring in groups of 12 and many business people are used to thinking of them in groups of 3 (i.e., quarters). If, however, the audience is not used to thinking of a given measure as occurring in groups of any particular size, or thinks of it in groups that are themselves a power of 10, then the “power of 10 multiplied by 1, 2, or 5” constraint would apply.
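Setting aside those domain-specific groupings, the base constraint is easy to express in code. What follows is a minimal sketch in Python (my choice of language, since none is specified here; the function name and the range of exponents are assumptions made for illustration) that enumerates the candidate intervals the rule allows, namely 1, 2, or 5 times a power of 10:

def candidate_intervals(min_exponent=-2, max_exponent=1):
    # Allowed intervals are 1, 2, or 5 times a power of 10.
    candidates = []
    for exponent in range(min_exponent, max_exponent + 1):
        for multiplier in (1, 2, 5):
            candidates.append(multiplier * 10 ** exponent)
    return candidates

print(candidate_intervals())
# Roughly: [0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50]

A scale-generating routine can then pick the smallest of these candidates that is at least as large as the raw interval implied by the data, as sketched near the end of this post.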

3. The scale should be anchored at zero.

This doesn’t mean that the scale needs to include zero but, instead, that if the scale were extended to zero, one of the value labels along the scale would be zero. Put another way, if the scale were extended to zero, it wouldn’t “skip over” zero as it passed it. In the graph below, if the scale were extended to zero, there would be no value label for zero, making it more difficult to perceive values in the graph:

Extended scale does not include zero stop
Source: www.xe.com, with modifications by author
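To illustrate this check (the function name and floating-point tolerance below are assumptions for this sketch, not part of any particular product’s algorithm), a scale is anchored at zero when each of its value labels is a whole multiple of the interval:

def anchored_at_zero(label, interval, tolerance=1e-9):
    # A label lies on a zero-anchored scale if label / interval is
    # (within floating-point tolerance) a whole number.
    ratio = label / interval
    return abs(ratio - round(ratio)) < tolerance

# The scale above (0.95, 1.10, 1.25, ... with an interval of 0.15) fails
# the check, so extending it downward would skip over zero.
print(anchored_at_zero(0.95, 0.15))  # False
# A scale of 1.25, 1.30, 1.35, ... with an interval of 0.05 passes.
print(anchored_at_zero(1.25, 0.05))  # True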

In terms of determining how many intervals to include and what quantitative range the scale should span, most graph rendering applications seem to get this right, but I’ll mention some guidelines here for good measure.

Regarding the actual number of intervals to include on the scale, this is a little more difficult to capture in a simple set of rules. The goal should be to provide as many intervals as are needed to allow for the precision that you think your audience will require, but not so many that the scale will look cluttered, or that you’d need to resort to an uncomfortably small font size in order to fit all of the intervals onto the scale. For horizontal quantitative scales, there should be as many value labels as possible that still allow for enough space between labels for them to be visually distinct from one another.

When determining the upper and lower bounds of a quantitative scale, the goal should be for the scale to extend as little as possible above the highest value and below the lowest value while still respecting the three constraints defined above. There are two exceptions to this rule, however:

  1. When encoding data using bars, the scale must always include zero, even if this means having a scale that extends far above or below the data being featured.
  2. If zero is within two intervals of the value in the data that’s closest to zero, the scale should include zero.

It should be noted that these rules apply only to linear quantitative scales (e.g., 70, 75, 80, 85), and not to other scale types such as logarithmic scales (e.g., 1, 10, 100, 1,000), for which different rules would apply.
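Putting the constraints and guidelines together, here is a hedged sketch in Python of how a scale-picking routine might work. The function names, the default target of five intervals, and the exact handling of the two exceptions above are assumptions made for illustration, not a reproduction of any particular product’s algorithm:

import math

def nice_interval(raw_interval):
    # Round a raw interval up to the nearest 1, 2, or 5 times a power of 10.
    exponent = math.floor(math.log10(raw_interval))
    base = 10 ** exponent
    for multiplier in (1, 2, 5, 10):
        if multiplier * base >= raw_interval:
            return multiplier * base

def linear_scale(data_min, data_max, target_intervals=5, bars=False):
    # Returns (lower, upper, interval) for a linear quantitative scale:
    # equal intervals, a 1/2/5-times-a-power-of-10 step, anchored at zero,
    # and extended as little as possible beyond the data.
    if bars:
        # Exception 1: bars must always be measured from zero.
        data_min = min(data_min, 0)
        data_max = max(data_max, 0)

    interval = nice_interval((data_max - data_min) / target_intervals)

    # Anchored at zero: the bounds are whole multiples of the interval.
    lower = math.floor(data_min / interval) * interval
    upper = math.ceil(data_max / interval) * interval

    # Exception 2: if zero is within two intervals of the data value
    # closest to zero, include zero in the scale.
    if 0 < data_min <= 2 * interval:
        lower = 0
    elif -2 * interval <= data_max < 0:
        upper = 0

    return lower, upper, interval

# For example, hypothetical data running from 1.28 to 1.43 yields a 0.05
# interval spanning 1.25 to 1.45, avoiding the sort of awkward 0.15
# interval seen in the xe.com graph above.
print(linear_scale(1.28, 1.43))  # (1.25, 1.45, 0.05), approximately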

In my experience, these seem to be the constraints that major data visualization applications respect, although Excel 2011 for Mac (and possibly other versions and applications) happily recommends scale ranges for bar graphs that don’t include zero, and seems to avoid scale intervals that are powers of 10 multiplied by 2, preferring to use only powers of 10 or powers of 10 multiplied by 5. I seem to be coming across poorly designed scales more often, however, which is probably due to the proliferation of small-vendor, open-source, and home-brewed graph-rendering engines in recent years.

Nick Desbarats