Thanks for taking the time to read my thoughts about Visual Business
Intelligence. This blog provides me (and others on occasion) with a venue for ideas and opinions
that are either too urgent to wait for a full-blown article or too
limited in length, scope, or development to require the larger venue.
For a selection of articles, white papers, and books, please visit
August 17th, 2016
Exploring and analyzing data is not at all like pumping your own gas. We should all be grateful that when gas stations made the transition from full service to self service many years ago, they did not relegate auto repair to the realm of self service as well. Pumping your own gas involves a simple procedure that requires little skill.
Repairing a car, however, requires a great deal of skill and the right tools.
The same is true of data exploration and analysis (i.e., data sensemaking).
Self service has become one of the most lucrative marketing campaigns of the last few years in the realms of business intelligence (BI) and analytics, second only to Big Data. Every vendor in the BI and analytics space makes this claim, perhaps without exception. Self-service data sensemaking, however, is an example of false advertising that’s producing a great deal of harm. How many bad decisions are being made based on specious analytical findings by unskilled people in organizations that accept the self-service myth? More bad decisions than good, I fear.
Describing analytics as “self service” suggests that it doesn’t require skill, that the work can be done merely by knowing how to use the software tool that supports “self-service analytics.” Data sensemaking, however, is not something that tools can do for us. Computers are not sentient; they do not possess understanding. Tools can at best assist us by augmenting our thinking skills, if they’re well designed, but most of the so-called self-service BI and analytics tools are not well designed. At best, these dysfunctional tools provide a dangerous illusion of understanding, not the basis on which good decisions can be made.
Some software vendors frame their products as self service out of ignorance: they don’t understand data sensemaking and therefore don’t understand that self service doesn’t apply. To them, data sensemaking really is like pumping your own gas. The few software vendors that understand data sensemaking frame their products as self service because the deceit produces sales. They don’t like to think of it as deceitful, however, but merely as marketing, the realm in which anything goes.
How did it become acceptable for companies that support data sensemaking—the process of exploring and analyzing data to find and understand the truth—to promote their products with lies? Why would we ever put our trust in companies that disrespect the goal of data sensemaking—the truth—to this degree? Honest vendors would admit that their products, no matter how well designed, can only be used effectively by people who have developed analytical skills, and only to the degree that they’ve developed them. This shouldn’t be a difficult admission, but vendors lack the courage and integrity that’s required to make it.
Some vendors take the self-service lie to an extreme, arguing that their tools take the human out of the loop of data sensemaking entirely. You simply connect their tools to a data set and then sit back and watch in amazement as they explore and analyze the data at blinding speeds, resulting in a simple and complete report of useful findings. At least one vendor of this ilk—BeyondCore—is being hailed as a visionary by Gartner. This is the antithesis of vision. No skilled data analyst would fall for this ruse, but skilled analysts, unfortunately, are rarely the ones involved in software purchase decisions.
Let’s be thankful that we can save a little money and time by pumping our own gas, but let’s not extend this to the realm of untrained data sensemaking. Making sense of data requires skills. Anyone of reasonable intelligence who wishes can develop these skills, just as they develop all other skills, through study and deliberate practice. That’s how I did it, and these skills have been richly rewarding. The people and organizations who recognize self-service analytics for the absurd lie that it is and take time to develop analytical skills will emerge as tomorrow’s analytical leaders.
August 12th, 2016
During the long course of my professional life, I’ve observed a disturbing trend. People sometimes claim expertise in one field based on experience in another. This is a fallacious and deceitful claim. I have extensive experience in visual design, but I cannot claim expertise in architecture. Any building that I designed would most certainly crumble around me. I’m a skilled teacher, but this does not qualify me as a psychotherapist. That hasn’t stopped me from occasionally giving advice to friends, but without charge, which probably matches its worth. Although these fields of endeavor overlap in some ways, expertise in one does not convey expertise in another. No concert violinist would claim the transfer of that virtuosity to the saxophone, but IT professionals sometimes make claims that are every bit as audacious.
The field of business intelligence (BI) provides striking examples of this trend. When BI initially emerged, data warehousing was the pre-existing field of endeavor that supplied BI with most of its initial workers and technologies. Years earlier, relational database theory and management supplied most of the initial workers and technologies of data warehousing. Today, the field of endeavor that goes by such names as analytics, data science, data visualization, and performance management is the domain of workers and technologies that were previously associated with BI and in many cases still are. I know several individuals who began their careers as experts in relational databases, who then moved into data warehousing, and then into BI, and finally into analytics and its kin without actually developing expertise in any but their initial field of endeavor. Instead, they made names for themselves in relational databases or data warehousing, and then transferred that reputation to each subsequent field of endeavor with little study or experience, and thus little skill. Many of the people who give keynotes today at BI/Analytics/Big Data conferences and who write white papers on related topics fall into this category. This is one of the reasons why domains related to analytics are so confusing, hype-filled, and poorly realized.
The skill sets that were needed to design and build relational databases or even data warehouses are significantly different from those that are needed for expert data sensemaking. I know this quite well, because early in my career I studied and taught relational database design, but when data warehousing emerged, I found that most of my skills were not transferable. I learned this the hard way by initially trying to build data warehouses using my relational database skills and failing miserably. Over the course of years, I retooled. When BI stole the limelight from data warehousing, I became fascinated by its intentions and vision, defined initially by Howard Dresner as “concepts and methods to improve business decision making by using fact-based support systems.” This harkened back to my first full-time job in IT, when I worked in the “Decision Support” group of a large semiconductor company. Even though I began my career helping people use data to support better decisions, when I began focusing on BI, relatively little that I had learned about data warehousing was useful. I had to shift my technology-centric focus back to a perspective that was in line with my university studies in the social sciences. I needed to understand the human brain, the process of decision making, and the ways that technologies could assist in this essentially human activity. This took years and led me to entirely new areas of study, including human-computer interface design. Later, when I narrowed my focus to data visualization, once again I had to humbly accept the position of a novice. My previous studies and diverse areas of experience contributed a great deal to the eventual richness of my expertise in data visualization, but they did not bestow upon me the mantle of expertise. That, I had to earn through diligent study and years of deliberate practice. It is by these same diligent means that I continue to deepen and broaden my data visualization expertise today.
Many of those who think themselves data visualization experts today base this belief primarily on experience in graphic design. While it is true that expertise in graphic design can contribute to the development of expertise in data visualization, there is a great deal more to learn and practice if you wish to understand and effectively practice data visualization. As an expert in data visualization, I have as much of a right to claim expertise in graphic design as an expert graphic designer can rightfully claim expertise in data visualization, which is very little.
I’m tempted to say that “Expertise isn’t what it used to be.” It certainly seems that people make claims of expertise today with little actual knowledge or experience, but I suppose this might have always been so. I doubt it, however, for I believe that the ready availability of information on the web has inclined people to think that expertise is equally accessible. It isn’t. Whereas information can be looked up easily and quickly, expertise requires effort and time. It’s worthy of both.
August 9th, 2016
I’ve spent a great deal of time over the years, and especially during the last few months, thinking deeply about the role of technologies in exploratory data analysis. When we create technologies of all types, we should always think carefully about their effects. Typically, new technologies are created to solve particular problems or to satisfy particular needs, so we attempt to consider how well they will succeed in doing this. But this isn’t enough. We must also consider potential downsides—ways in which those technologies might cause harm. This is especially true of information technologies, and data sensemaking technologies in particular, but this is seldom done by the companies that make them. The prevailing attitude in our current technopoly is that information technologies are always inherently good—what possible harm could there be? Some attention is finally being given to the ways in which information can be misused, but this isn’t the only problem.
Whenever we hand tasks over to computers that we have always done ourselves, we run the risk of losing critical skills and settling for results that are inferior. Tasks that involve thinking strike at the core of humanity’s strength. We sit on the top of the evolutionary heap because of the unique abilities of our brains. Surrendering thinking tasks to technologies ought to be approached with great caution.
I’d like to share a few guidelines that I believe software companies should follow when adding features to exploratory data analysis tools. Please review the following list and then share with me your thoughts about these guidelines.
- Leave out any task that humans can do better than computers.
- Leave out any task that’s associated with an important skill that would be lost if we allowed computers to do it for us.
- Leave out any feature that is ineffective.
- Add features to perform tasks that computers can do better than humans.
- Add features to perform tasks that humans do not benefit from performing in some important way.
- Add features that are recognized as useful by skilled data analysts, but only after considering the full range of implications.
- Never add a feature simply because it can be added or because it would be convenient to add.
- Never add a feature merely because existing or potential customers ask for it.
- Never add a feature simply because an executive wants it.
- Never design a feature in a particular way because it is easier than designing it in a way that works better.
- Never design a feature that requires human-computer interaction without a clear understanding of the human brain—its strengths and limitations.
- Never design a feature that requires human-computer interaction that forces people to think and act like computers.
July 25th, 2016
Like many of you, I grew up attending Sunday school. One of my memories of that experience involves a song that was a favorite among us kids: “Deep and Wide.” It consists of only a few words, sung over and over:
Deep and wide, deep and wide,
There’s a fountain flowing deep and wide.
What we loved about the song was not the words, which we didn’t understand (I still don’t), but the hand motions that went with them. For “deep,” we would hold our arms out in front of us with one hand extended high and the other low. For “wide,” we would extend our arms to the sides as far as we could reach, hoping to smack the kids to our left and right. That joke never got old.
These words came flooding back into my memory today as I was thinking about the need in data sensemaking to dig deep into data but also to explore data broadly. Deep and wide, focus and context, detail and summary, trees and forest are all expressions that capture these two fundamental perspectives from which we should view our data if we wish to understand it. Errors are routinely made when we dig into a specific issue and form judgments without understanding it in context. Exploring data from every possible angle provides the context that’s necessary to understand the details. It keeps us from getting lost among the trees, wandering from one false conclusion to another, fools rushing in and rushing out, never really knowing where we’ve been.
July 18th, 2016
Expertise isn’t what it used to be. Beginning with the industrial revolution and continuing into our modern information age, new technologies have altered our view of expertise and influenced the degree to which we pursue it. Technologies, properly understood, are tools that we humans create to augment our abilities. Good technologies are created and used by experts to extend their skills, not to replace them.
This relationship between experts and the technologies that they use was much clearer before the industrial revolution. Skilled craftspeople—carpenters, blacksmiths, cooks, farmers, and even accountants—cherished their tools and used them well. They understood which tools to use, when to use them, and how to use them productively. Their tools were a natural extension of their minds and bodies. Since the beginning of the industrial revolution, however, many tasks that were performed by humans in the past are now performed by machines. This is a mixed blessing. Some tasks can be performed better by machines, such as fast mathematical calculations done by an abacus, calculator, or computer. Some tasks that can be done better by people can be done by machines more cheaply, so we sometimes sacrifice quality for affordability, such as when we buy a piece of manufactured furniture rather than paying a skilled craftsperson to build something better. Some tasks, however, can only be performed by humans and can at best be augmented by technologies. Data sensemaking and communication fall into this category.
What happens when technologies are used to do what only humans can do well? The outcomes are poor in comparison and people are discouraged from fully developing those skills. Similarly, what happens when the tools of a trade are designed by people who don’t understand the trade? Again, the outcomes suffer and people with expertise are frustrated in their efforts. If you were a warrior of bygone days preparing for battle, would you buy a sword made by someone who didn’t intimately understand its use in battle? Not if you wanted to survive. Most data sensemaking and communication tools are dull blades with slippery hilts.
My field of data visualization—the use of visual representations to explore, make sense of, and communicate quantitative data—falls into the broader category of knowledge work. Expertise in knowledge work can be difficult to assess. This is different from expertise playing the violin or performing gymnastics. Over hundreds of years, clear standards and measures of musical and athletic performance have been established, along with clear methods for developing expertise under the guidance of teachers, mentors, and coaches. Unlike these areas, methods for developing skill in data visualization are not firmly established. The field of data visualization is chaotic. We can’t even agree on a definition of the term, let alone determine what qualifies as expertise and the path to developing it. In my own mind and work, however, the field is clearly defined and the principles and practices, although neither complete nor fully formed, are firmly rooted in science and years of practical experience.
During the 20th century and so far in the 21st, we have watched in amazement as musicians and athletes have achieved what was previously thought impossible. Expertise in these realms has increased as each new generation built on the foundation of its forebears, coaxing their brains and bodies to reach new heights through increasingly advanced training regimens. In the 1908 Summer Olympics, a diver barely averted disaster when he attempted a double somersault, which was considered too dangerous, prompting recommendations that it be banned from competition. Today, the double somersault is an entry-level dive. Ten-year-olds can perform it perfectly, and in high school the best divers are doing four and a half somersaults. Sadly, similar advancement is not happening among most knowledge workers. Are data sensemakers and communicators more skilled on average today than they were 50 years ago? I doubt it. In fact, it’s entirely possible that expertise in this realm has declined as technologies have displaced and discouraged the skilled efforts of humans.
The web has contributed to the problem. Despite its many benefits, the web has provided a convenient platform for inflated claims of expertise. In data visualization, the actual number of experts is but a small fraction of those who boldly make the claim in blogs. And now, as traditional book publishers are scrambling to remain viable, they eagerly offer book contracts to any blogger with a modest following. You can get a book published without first developing expertise in the subject matter. The book Data Visualization for Dummies by Mico Yuk is a vivid example. Apparently Wiley Press forgot that “for dummies” wasn’t meant to be taken literally.
Don’t claim expertise that you don’t possess. Never inflate your abilities. False claims do harm, even to yourself. If you believe that you’ve already reached the heights of achievement, you’ll spend your time demonstrating and proclaiming your minor achievements, rather than working to improve. You can only evaluate your own expertise by comparing yourself to experts, not to others with superficial knowledge and skills on par with your own. Forming a mutual admiration society built on mediocrity might feel good to its members, but it isn’t progress.
What’s especially disheartening about the current lack of expertise in data visualization is the fact that expertise is within reach. Expertise is not an exclusive club of the uniquely talented. We can all develop deep expertise in a chosen field. In every field of pursuit, we develop expertise in the same way: through a great deal of study and practice. The only natural talent that’s needed for developing expertise is one that we all possess: highly adaptable brains and bodies. We humans are the animal that learns. But if we trade this evolutionary advantage to instead become the animal that is shaped and limited by its tools, we won’t survive for long.
I recently wrote about deep work, the focused activity that is needed to perform at optimal levels. Now I’m talking about a kindred process, deep learning, which is needed to develop expertise. A wonderful new book titled Peak: Secrets from the New Science of Expertise was recently written by one of the world’s great authorities on the topic, Anders Ericsson, with the help of science writer Robert Pool.
Ericsson, a professor of psychology at Florida State University, has been researching expertise for over 30 years. In Peak, Ericsson explains what expertise is, both in practice and in terms of brain development, and describes the universal gold standard of study and practice that is needed to achieve it, which he calls deliberate practice. I won’t steal his thunder by revealing the content of the book, except to say that he dispels the magical thinking about shortcuts to expertise. Deliberate practice is hard work and it takes a great deal of time. Not all practice is productive in ways that increase expertise. Deliberate practice involves guidance and feedback from existing experts. It takes advantage of the profound adaptability of our brains and bodies when pushed beyond our comfort zones in the right way and to the right degree. To whet your appetite, here’s a brief excerpt from the book:
But we now understand that there’s no such thing as a predefined ability. The brain is adaptable, and training can create skills…that did not exist before. This is a game changer, because learning now becomes a way of creating abilities rather than of bringing people to the point where they can take advantage of their innate ones. In this new world it no longer makes sense to think of people as born with fixed reserves of potential: instead, potential is an expandable vessel, shaped by the various things we do throughout our lives. Learning isn’t a way of reaching one’s potential but rather a way of developing it.
Expertise is potentially available to anyone who will commit to a prolonged and disciplined process of deliberate practice.
When I wrote my blog piece titled “Data Visualization Lite” not long ago, some of you might have thought that I was stroking my own ego by expressing dissatisfaction with the accomplishments of newcomers to the field. That wasn’t the case. I am genuinely discouraged by the paucity of good infovis research, by the redundancy and errors of most recent books on data visualization practices, and by the mediocre design and functionality of data visualization tools. I want us to do better and I know that we can, but not by doing business as usual. The rapid rise in the popularity of data visualization, which began a little over a decade ago, has done more harm than good. Popularity often breeds mediocrity. We must stuff socks in the mouths of marketers and shift the message from the gospel of salvation through technologies to an emphasis on human skills.
I’ve dedicated my professional life to the development of these skills, both in myself and in others. I want the efforts of others to surpass my own. I want to be left in the dust, rather than constantly looking back and yelling “This way. Hurry up!” This will take deep learning that builds on the best work that’s been done so far. There are no shortcuts. Expertise is the result of hard work, and at times it is no more fun than practicing the violin for several hours every day, but it’s worth it.