Thanks for taking the time to read my thoughts about Visual Business Intelligence. This blog provides me (and others on occasion) with a venue for ideas and opinions that are either too urgent to wait for a full-blown article or too limited in length, scope, or development to require the larger venue. For a selection of articles, white papers, and books, please visit my library.

 

Gartner’s Annual Magic Show

March 3rd, 2017

Even though I’ve questioned the usefulness and integrity of Gartner’s Magic Quadrant many times when it’s been applied to products related to analytics and data visualization, I’ve recently realized that there’s at least one aspect of the Magic Quadrant for which we should be grateful: the honesty of its name. By calling the quadrant “magic,” Gartner helpfully hints that it should not be taken seriously—it’s magical. We should approach it as we would the performance of a stage magician. When reading it, we should suspend disbelief and simply enjoy the ruse. Gartner’s Magic Quadrant is an act of misdirection, sleight of hand, smoke and mirrors. Understood as such, it’s grand entertainment.

Gartner recently published the 2017 edition of its “Magic Quadrant for Business Intelligence and Analytics Platforms.” As in past years, it is not a valid assessment of the products and vendors. Unfortunately, many organizations will nevertheless use it to select products and vendors. There is a dirty little secret about the Magic Quadrant that most industry leaders won’t publicly admit: few of them, including the vendors themselves, take the Magic Quadrant seriously as a valid assessment. They laugh about it in whispers and behind closed doors. They do take it seriously, however, as a report that exercises an undeserved degree of influence. The Magic Quadrant is a highly subjective evaluation of products and vendors that reflects the interests of the vendors that appear on it, as well as Gartner’s own interests, for Gartner is paid dearly by those vendors. Gartner’s coverage is about as “fair and balanced” as Fox News.

Although analytics (i.e., data sensemaking) should always have been central to any serious review of Business Intelligence tools, it took Gartner several years to shift the focus of its Magic Quadrant for Business Intelligence to one that supposedly embraces its importance—a transition that Gartner now claims is complete. Unfortunately, Gartner does not understand analytics very well, so it bases its evaluation on criteria that reflect the interests and activities of the vendors rather than a clear understanding of analytical work and its requirements. The criteria largely focus on technical features rather than on the fundamental needs of data analysts, and many of the features on which the assessment is based are distractions at best and, in some cases, recipes for disaster.

I won’t take the time to critique this year’s Magic Quadrant in detail, but will instead highlight a few of its flaws.

The Magic Quadrant displays its findings in a scatterplot that has been divided into four equal regions: “Niche Players,” “Challengers,” “Visionaries,” and “Leaders.” As with all scatterplots, a quantitative scale is associated with each of the axes: “Completeness of Vision” on the X-axis and “Ability to Execute” on the Y-axis.

Magic Quadrant

The actual measures that have been assigned to each vendor for these two variables are not shown, however, nor are the underlying measures that were combined to produce these high-level measures. Gartner is not willing to share this data, so we have no way to assess the merits of the results that appear in the Magic Quadrant. Even if we could see the data, the Magic Quadrant would be of little use, for it doesn’t measure the most important qualities of BI and analytics products, nor is it based on data that is capable of assessing the merits of these products. We cannot actually measure a vendor’s ability to execute or its completeness of vision. Gartner’s conclusion that the vendors with the most complete visions are Microsoft, followed at some distance by Salesforce and ClearStory Data, is laughable. The visions that place these vendors at the forefront in the Magic Quadrant will not lead to improved data sensemaking. Something is definitely wrong with the way that vision is being measured.

If I decided to use a scatterplot to provide a summary assessment of these products, I would probably associate “Usefulness” with one axis and “Effectiveness” with the other. What matters most is that the tools that we use for data sensemaking provide the functionality that is most useful and do so in a way that works well.
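
To illustrate what I have in mind, here is a rough sketch in Python (using matplotlib) of what such a chart could look like. The tool names and scores below are purely hypothetical placeholders for illustration; this is not an actual assessment of any product.

```python
# A minimal sketch, assuming hypothetical "Usefulness" and "Effectiveness"
# scores on 0-10 scales. Tool names and values are placeholders only.
import matplotlib.pyplot as plt

tools = {
    "Tool A": (8.5, 7.0),   # (usefulness, effectiveness)
    "Tool B": (6.0, 8.0),
    "Tool C": (4.5, 3.0),
    "Tool D": (7.5, 5.5),
}

fig, ax = plt.subplots(figsize=(6, 6))
for name, (usefulness, effectiveness) in tools.items():
    ax.scatter(usefulness, effectiveness, s=60, color="steelblue")
    ax.annotate(name, (usefulness, effectiveness),
                xytext=(5, 5), textcoords="offset points")

# Midpoint reference lines, mimicking the four-region layout of the
# Magic Quadrant without attaching labels such as "Leaders" to them.
ax.axvline(5, color="gray", linewidth=0.8)
ax.axhline(5, color="gray", linewidth=0.8)

ax.set_xlim(0, 10)
ax.set_ylim(0, 10)
ax.set_xlabel("Usefulness")
ax.set_ylabel("Effectiveness")
ax.set_title("Hypothetical assessment of data sensemaking tools")
plt.show()
```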

The Magic Quadrant is almost entirely based on responses to questionnaires that are completed by the vendors themselves and by those who use their products. This is not the basis for a meaningful evaluation. It is roughly equivalent to evaluating Mr. Trump’s performance by asking only for his opinion and that of those who voted for him. The degree of bias that is built into this approach is enormous. Obviously, we cannot trust what vendors say about themselves, nor can we trust the opinions of those who use their products, for they will almost always be biased in favor of the tools that they selected and use and will lack direct knowledge of other tools. The best way to evaluate these products would involve a small team of experts using a good, consistent set of criteria to review and test each product as objectively as possible. Questionnaires completed by those who routinely use the products could be used only to alert the experts to particular flaws and merits that might not be obvious without extensive use. Why doesn’t Gartner evaluate the field of vendors and products in this manner? Because it would involve a great deal more work and require a team of people with deep expertise acquired through many years of doing the actual work of data sensemaking.
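
To make this concrete, here is a hypothetical sketch in Python of the kind of simple scoring rubric such an expert review might use: each expert scores every product against the same set of criteria, and the criterion means are combined using weights. The criteria, weights, products, and scores are invented for illustration only; they are not drawn from Gartner or from any actual review.

```python
# A minimal sketch of a weighted expert-review rubric. Criteria, weights,
# products, and scores are all hypothetical placeholders.
from statistics import mean

criteria_weights = {
    "data preparation": 0.25,
    "exploratory analysis": 0.35,
    "visualization quality": 0.25,
    "reporting and sharing": 0.15,
}

# reviewer_scores[product][criterion] -> one 0-10 score per expert reviewer
reviewer_scores = {
    "Product X": {
        "data preparation": [7, 8, 6],
        "exploratory analysis": [5, 6, 6],
        "visualization quality": [8, 7, 9],
        "reporting and sharing": [6, 6, 7],
    },
}

def overall_score(scores_by_criterion):
    """Weighted average of the experts' mean score on each criterion."""
    return sum(weight * mean(scores_by_criterion[criterion])
               for criterion, weight in criteria_weights.items())

for product, scores in reviewer_scores.items():
    print(f"{product}: {overall_score(scores):.1f} out of 10")
```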

Immediately following a two-sentence “Summary” at the beginning of the report, Gartner lists its “Strategic Planning Assumptions,” which are in fact a set of six prognostications for the near future. Calling them assumptions lends these guesses a credence that they don’t deserve. They are not predictions based on solid evidence; they read more like a wish list. Let’s take a look at the list.

By 2020, smart, governed, Hadoop/Spark-, search- and visual-based data discovery capabilities will converge into a single set of next-generation data discovery capabilities as components of modern BI and analytics platforms.

At least one member of Gartner’s team of BI analysts must have a marketing background. This is meaningless drivel that can neither be confirmed nor denied in 2020.

By 2021, the number of users of modern BI and analytics platforms that are differentiated by smart data discovery capabilities will grow at twice the rate of those that are not, and will deliver twice the business value.

For some unknown reason, this prediction will take a year longer than the others to be realized. What are these “smart data discovery capabilities”?

Smart data discovery — introduced by IBM Watson Analytics and BeyondCore (acquired by Salesforce as of September 2016) — leverages machine learning to automate the analytics workflow (from preparing and exploring data to sharing insights and explaining findings). Natural-language processing (NLP), natural-language query (NLQ) and natural-language generation (NLG) for text- and voice-based interaction and narration of the most statistically important findings in the user context are key capabilities of smart data discovery.

First off, I hope this doesn’t come true, because these so-called “smart data discovery capabilities” are almost entirely hokum. Relinquishing control of data sensemaking to algorithms will be the death of meaningful and useful analytics. Regardless, there is no actual way to confirm whether those who use these capabilities will “grow at twice the rate” of those who don’t, and there certainly isn’t a way to measure a two-fold increase in business value. Even if Gartner defined what it means by these measures, it would have no way to gather the data.

By 2020, natural-language generation and artificial intelligence will be a standard feature of 90% of modern BI platforms.

This is somewhat redundant because Gartner defines smart data discovery, addressed in the previous prediction, as products that incorporate machine learning and natural language processing. I’m assuming that by “artificial intelligence” Gartner is actually referring to machine learning algorithms, because none of these products will incorporate true AI by 2020, and probably never will.

By 2020, 50% of analytic queries will be generated using search, natural-language processing or voice, or will be autogenerated.

According to this, by 2020 the 90% of BI products that incorporate natural language processing will be used to generate 50% of all queries through natural language interfaces. That sounds cool, but isn’t. Natural language is not an efficient way to generate data sensemaking queries. Anyone who knows what they’re doing will prefer to use well-designed interfaces that allow them to directly manipulate information and objects on the screen rather than using words.

By 2020, organizations that offer users access to a curated catalog of internal and external data will realize twice the business value from analytics investments than those that do not.

What do they mean by a “curated catalog”? Here’s the closest that they come to a definition:

A curated agile data catalog where business users can search, access, find and rate certified internal data as well as open and premium external data with workflow — in order to promote harmonized data to certified status — is becoming key to governed modern deployments leveraging complex distributed data with an increasing number of distributed content authors.

This is mostly gobbledygook. Without a clear idea of what this is, this prediction can never be confirmed, and even if the meaning were clear, we would not be able to determine whether these features led to “twice the business value.”

The sixth and final prediction is one of my favorites:

Through 2020, the number of citizen data scientists will grow five times faster than the number of data scientists.

As I’ve written before, there is no science of data. The term data scientist is a misnomer. Even if this were not the case, there is no commonly accepted definition of the term, so this prediction is meaningless. Now, add to this a new term that is even more meaningless—“citizen data scientist”—and we have the makings of complete nonsense. And finally, if in 2020 you can demonstrate that the number of so-called citizen data scientists grew five times faster than the number of so-called data scientists, I’ll give you my house.

It’s ironic that Gartner makes such unintelligent statements about business intelligence and such unanalytical statements about analytics and then expects us to trust it. Unfortunately, this irony is missed by most of the folks who rely on Gartner’s advice.

Take care,

Signature

Tell Me a Story, or Not

February 6th, 2017

I began talking about finding and then telling the stories that reside in data 13 years ago, several years before “data storytelling” became a common expression and popular pursuit. Mostly, I was speaking of stories metaphorically. In some respects, I regret using this expression because, like many metaphors, its use has become overblown and misleading. I did not mean to suggest that stories literally reside in data, or, if I did, I was mistaken. Rather, facts reside in data from which stories can sometimes be woven. Literally speaking, storytelling involves a narrative presentation that consists of a beginning, middle, and end, along with characters, plots, and often dramatic tension. Data does not tell stories; people do.

Don’t be misled: data storytelling (i.e., the presentation of data in narrative form) makes up a tiny fraction of data visualization. The vast majority of data visualizations that we create present facts without weaving them into stories. Relatively few of the facts that we display in data visualizations lend themselves to storytelling. I’m not diminishing the usefulness of data storytelling, which can be incredibly powerful when appropriate and done well. I’m merely pointing out that data storytelling is not some new endeavor or skillset that dominates data visualization. It is a minor—but nonetheless important and useful—aspect of data visualization. Not everyone who works in the field of data visualization must be a skilled storyteller. In general, it’s more valuable to be skilled in data sensemaking and graphicacy, as well as a clear thinker and communicator, and to possess knowledge of the data domain.

When facts can indeed be woven into a story, however, do so if you know how. We love stories. They can breathe life into data. Just don’t try to impose a story on a set of facts to create life where it doesn’t exist.

Take care,

Signature

Your Attention Has Been Traded in a Faustian Bargain

January 25th, 2017

Tim Wu, the Columbia Law School professor who coined the term “net neutrality,” is the author of an important and extraordinarily well-researched and well-written new book titled The Attention Merchants: The Epic Scramble to Get Inside Our Heads. In it, Wu traces the entire history of the technology-enabled work of merchants—those who wish to sell us something—to dominate our attention and, in so doing, to create demand for their products and services.

The Attention Merchants

From the snake-oil salesmen of old to the more pervasive, less noticeable, and more effective methods that define today’s Web experience, this book takes us on a comprehensive and insightful journey through the entire history of advertising and explains how the efforts of attention merchants have not only created demands that artificially dominate our lives, but have also gotten into our heads in ways that have fundamentally changed who we are. Even though Wu is concerned that attention-grabbing technologies have had ill effects, this book is not a screed. It exudes the even-handed tone of a scholar. Wu lays out the facts without preaching, but his concern is nonetheless evident and pressing.

A good life and the sensible decisions that enable it are hindered by the onslaught of distractions that dominate most of our attention today. The constant tug of social media, ubiquitously accompanied by increasingly targeted advertising content, leaves little room for reflection. It exercises its influence largely at an unconscious level.

As the world fills to overflowing with unremitting noise, our lives are impoverished. We have traded our attention for hollow promises of useful content and experiences: a Faustian bargain. The dominant attention merchants of our day, Web-based services such as Facebook and Google, have a self-serving agenda that is much different from ours, but that isn’t obvious without finding a moment of stillness in the eye of the storm. Several authors have written about the battle for our attention that is being waged against us today, including several whose books I’ve reviewed in this blog. Tim Wu has added significantly to this literature by telling the chilling story of the attention merchants.

We can choose to opt out of this Faustian bargain that we’ve inadvertently made, but it isn’t easy. By reading The Attention Merchants, you will learn about the main forces that have created this problem—all familiar names—and this knowledge will equip you for battle.

Take care,

Signature

There Is No Science of Data

January 23rd, 2017

“Data Science” is a misnomer. Science, in general, is a set of methods for learning about the world. Specific sciences are the application of these methods to particular areas of study. Physics is a science: it is the study of physical phenomena. Psychology is a science: it is the study of the psyche (i.e., the human mind). There is no science of data.

Data is a collection of facts. Data, in general, is not the subject of study. Data about something in particular, such as physical phenomena or the human mind, provides the content of study. To call oneself a “data scientist” makes no sense. One cannot study data in general. One can only study data about something in particular.

People who call themselves data scientists are rarely involved in science at all. Instead, their work primarily involves mathematics, and usually the branch of mathematics called statistics. They are statisticians or mathematicians, not data scientists. A few years ago, Hal Varian of Google declared that “statistician” had become the sexy job of our data-focused age. Apparently, Varian’s invitation to hold their heads high in pride was not enough for some statisticians, so they coined a new term. When something loses its luster, what do you do? Some choose to give it a new name. Thus, statisticians become data scientists and data becomes “Big Data.” New names, in and of themselves, change nothing but perception; nothing of substance is gained. Only by learning to engage in data sensemaking well will we do good for the world. Only by doing actual good for the world will we find contentment.

So, you might be wondering why anyone should care if statisticians choose to call themselves data scientists, a nonsensical name. I care because people who strive to make sense of data should, more than most, be sensitive to the deafening noise that currently makes the knowledge that resides in data so difficult to find. The term “data scientist” is just another example of noise. It adds confusion to a world that is already overly complicated and becoming more so.

Signature

P.S. I realize that the term “data science” is only one of many misnomers that confuse the realm of data sensemaking. I myself am guilty of using another: “business intelligence.” This term is a misnomer (and an oxymoron as well) in that, like data science, business intelligence, when practiced effectively, is little more than another name for statistics. It has rarely been practiced effectively, however. Most of the work and products that bear the name business intelligence have delivered overwhelming mounds of data that is almost entirely noise.

The Least Amount of Information

January 12th, 2017

As its default mode of operation, the human brain uses the least amount of information necessary to make sense of the world before making decisions. This product of evolution was an efficient and effective strategy when we lived in a simple, familiar world. We no longer live in that world. We can still use this strategy to make sense of those aspects of our world that remain relatively simple and familiar (e.g., walking from point A to point B without tripping or falling into a hole), but we must use more advanced strategies when navigating the complex or the unfamiliar. The default mode of thinking, which is intuitive, feeling-based, and fast, relying on efficient heuristics (rules of thumb), is called System 1 thinking. The more advanced and more recently evolved mode of thinking, which is reflective, rational, and slow, is called System 2 thinking. Both are valid and useful. The trick is knowing when to shift from System 1 to System 2.

In my opinion, many of the problems that we suffer from today occur because we fail to shift from System 1 to System 2 when needed. For instance, electing the president of the most powerful nation on Earth requires System 2 thinking. That’s obvious, I suppose, but even such a mundane task as grocery shopping requires System 2 thinking to avoid choices that are fueled merely by marketing.

Defaults are automatic and largely unconscious. A single default mode of thinking doesn’t work when life sometimes requires System 1 and at other times requires System 2. Rather than a default mode of thinking, we would benefit from a default of shifting into one mode or the other depending on the situation. This default doesn’t exist, but it could be developed, to an extent, through a great deal of practice over a great deal of time. Only by repeating the conscious act of shifting from System 1 to System 2, when necessary, over and over again, will we eventually reach the point where the shift becomes automatic.

For now, we can learn to bring our mode of thinking when making decisions into conscious awareness and create the moments that are necessary to effect the System 1 to System 2 shift when it’s needed. Otherwise, we will remain the victims of hunter-gatherer thinking in a modern world that demands complex and sometimes unfamiliar choices, many of which come with significant, potentially harmful consequences. How do we make this happen? This is a question that deserves careful (i.e., System 2) study. One thing I can say for sure, however, is that we can learn to pause. The simple act of stopping and taking a moment to ask, “Is this one of those situations that, because it is complex or unfamiliar, requires reflection?” is a good start.

Take care,

Signature