Thanks for taking the time to read my thoughts about Visual Business Intelligence. This blog provides me (and others on occasion) with a venue for ideas and opinions that are either too urgent to wait for a full-blown article or too limited in length, scope, or development to require the larger venue. For a selection of articles, white papers, and books, please visit my library.

 

Business Intelligence Industry – Get to Know Your Real Customers

June 17th, 2010

The BI industry has always failed to understand and support its real customers. With few exceptions, BI product vendors and consultancies continue to be acquainted primarily with IT. This is a comfortable, compatible relationship, for BI and IT both tend to see the world from an engineering-oriented, techno-centric perspective. But the BI industry’s real customers are the folks who actually use BI tools to transform data into the meaningful information they need to make better decisions. Although some of these folks work in IT, most do not. Most are not software engineers. Most are not technologists. Most are people who have a job to do that requires an awareness of what’s going on and how they might influence it, which is primarily gleaned from data. To do this, they need tools that enlighten.

In the past, when the BI industry focused exclusively on building an infrastructure for decision support by developing technologies that acquire, improve, store, and dispense massive amounts of data at high speeds, it was perhaps legitimate to engage primarily with IT. Today, however, the BI industry can no longer sit comfortably in locked rooms filled with servers, discussing bits and bytes with their IT comrades. Most organizations that have purchased BI solutions now know that they need more than BI infrastructure: they need to make sense of all that data they’re collecting, most of which today serves as a massive paperweight. Unfortunately, the BI vendors that helped build the infrastructure can’t use the same perspective, knowledge, and skills that made them successful in the past to produce data sensemaking (analytics) and communication tools. They must now shift from an engineering-oriented, techno-centric mindset to one that is design-oriented and human-centric. They must venture into unfamiliar territory. If they don’t, they’ll be left behind. Unfortunately, most of the major BI players haven’t realized this yet. Before they can begin to make the shift, they must first wake up.

I was prompted to write these words when I read a recent blog post by Boris Evelson of Forrester Research entitled “BI vs. Analytics.” Despite my impassioned disagreement with Evelson several months ago when he attempted to list the features of “advanced data visualization solutions” without first developing an understanding of data visualization, I found myself shouting “Amen” when I read the first two sentences of his recent blog entry:

In my definition—and believe it, I am fighting and defending it every day—analytics has always been, and will always be part of BI.

Indeed it has, at least by definition. Unfortunately, only in recent years have a few vendors managed to make analytics a part of BI in terms of actual analytical functionality. As I continued to read Evelson’s blog, however, I soon stumbled over the following statement: “Today most of the top BI vendors do have…advanced analytics…functionality, so it’s really a commodity now.” Apparently Evelson and I still see things quite differently. Analytics are now being claimed but not actually supported by most BI vendors. What most of them call analytics is so far from actual data sensemaking, it would be amusing if it weren’t so tragic. Analytics is not and never will be a commodity (that is, a good “which is supplied without qualitative differentiation across a market,” according to Wikipedia).

Evelson is not unique as a BI industry thought leader who fails to understand analytics. Few BI industry analysts and thought leaders have ever actually done the work of a data analyst. They’ve written ETL code, they’ve planned and managed BI implementations, they’ve developed reports, they’ve developed BI methodologies and strategies, and they’ve learned the intricacies of BI technologies, but they’ve never actually dipped below the surface of data sensemaking. What I’m saying is that most of BI’s prominent voices have at best a vague understanding of analytics, so they’re not the people you ought to be listening to for insight and advice in this particular realm. Only a few new experts with actual experience in analytics have raised their voices within BI circles in recent years—people like Tom Davenport and Jeanne Harris, the authors of Competing on Analytics and Analytics at Work. Their efforts are complementing statisticians and information visualization experts to raise the banner of BI’s ultimate purpose: data sensemaking. These are the voices that must be raised to a higher volume than those of the past if BI hopes to fulfill its original promise and ultimate goal—helping organizations function more intelligently by basing their decisions on evidence contained in data. The opportunity is now; the door is open. Not everyone in the BI industry, however, will walk through it.

Take care,

Circle-Lust Continues

May 27th, 2010

This blog entry was written by Bryan Pierce of Perceptual Edge.

Last week Stephen published an article entitled “Our Irresistible Fascination with All Things Circular,” which describes how people’s seemingly innate love for circles has led to the creation of many dysfunctional graphs, such as pie charts. Today, another example of a poorly designed circular graph came to our attention. A couple of months ago, Sunlight Labs hosted a contest called “Design for America,” which asked designers to create displays of government information for the purpose of making “government data more accessible and comprehensible to the American public.” A couple of days ago, they announced the winners. In the data visualization category there are plenty of examples of what not to do, the worst of which appears below.

This display is supposed to be used to compare the 2009 US Federal Contract Spending for several sectors to the amount of Media Coverage that those sectors received during the year. As you can see, the designers seem to have fallen into the same sort of circle-lust that Stephen wrote about last week. In this case, the circular shape seems to be entirely arbitrary, because the quantitative data is encoded only by the thickness of the rings. These circles serve the same purpose as stacked-bar graphs; they’ve just been stretched out and distorted into a circular shape.

Ignoring the uselessness of the circular design for a moment, what does this visualization tell us? The only thing it tells me is that Defense spending was vastly under-reported in the media during 2009 while Health and Energy spending were comparatively over-reported. Without a lot of effort, I can’t make meaningful comparisons between the information in the other sectors, because they’re too small and hard to see, and I can’t even make comparisons between the three largest sectors with much accuracy. It’s also difficult to read the names of the smaller sectors because they overlap.

Although it might not be as sexy, two horizontal bar graphs next to one another would work better: one for Federal Contract Spending and one for Media Coverage. The Federal Contract Spending graph could be sorted from highest to lowest and the Media Coverage graph could present the bars in the same order. This would make it very easy to compare a sector’s spending and media coverage (because they’d be aligned in a row), it would make exceptions jump out (because there’d be a difference in the length of the bar in the Media Coverage graph compared to its neighboring bars), and it would be easy to read the names of all the sectors. It would still be hard to decode the contract spending in some of the smaller sectors accurately (because their bars would be so much smaller than the Dept. of Defense bar), but at least all of the bars would share a labeled quantitative scale, which would make the task easier.
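As a rough sketch of the arrangement described above, the following Python snippet sorts sectors by contract spending and reuses that order for media coverage, drawing both as text bars on a shared 0 to 100 scale. The sector names, the percentage values, and the helper names `aligned_rows` and `text_bar` are all assumptions made up for illustration; they are not the contest's data.

```python
def aligned_rows(spending, coverage):
    """Sort sectors by spending (highest first) and pair each with its coverage."""
    order = sorted(spending, key=spending.get, reverse=True)
    return [(sector, spending[sector], coverage[sector]) for sector in order]


def text_bar(value, scale_max=100, width=40):
    """Draw a bar whose length is proportional to value on a shared scale."""
    return "#" * round(width * value / scale_max)


# Hypothetical percentage shares, invented for illustration only.
spending_share = {"Defense": 70, "Energy": 12, "Health": 10, "Transportation": 8}
coverage_share = {"Defense": 20, "Energy": 30, "Health": 40, "Transportation": 10}

print(f"{'Sector':<16}{'Contract spending':<42}Media coverage")
for sector, spent, covered in aligned_rows(spending_share, coverage_share):
    # Each sector's two bars sit on one row, so mismatches are easy to spot.
    print(f"{sector:<16}{text_bar(spent):<42}{text_bar(covered)}")
```

Because the second column keeps the order set by the first, any sector whose coverage bar is much longer or shorter than its neighbors stands out immediately.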

Another useful alternative, which would put even more focus onto the relationship between Federal Contract Spending and Media Coverage, while making the exceptions jump out, would be a scatterplot that displayed Federal Contract Spending on the x-axis and Media Coverage on the y-axis.

It is unfortunate that most of the winners of the Design for America contest don’t represent useful designs. The fact that the circular graph above was a winner means either that the judges of the contest had a terrible selection of designs to choose from or that the judges don’t understand data visualization. This is sad, not just because people are being given $5,000 prizes for impoverished displays, but because this information is important and it could benefit people if it were presented in a useful way.

-Bryan

BP Oil Collection – Is the Effort Really Improving?

May 26th, 2010

A colleague sent me a link to Rachel Maddow’s website today where she features a graph that was used by BP senior vice president Kent Wells to show how the company’s efforts to collect the oil that’s spewing into the ocean at a rate of several thousand barrels per day are improving. He talks about adjustments that they’ve made to the siphon, then says “Here you can see how we’ve continued to ramp up.” But is this really what’s happening?

Although the graph doesn’t outright lie, BP is relying on the viewer’s assumption that a series of bars that increases in height represents an increase in performance. In this case it does not, however, because each bar displays the cumulative amount of oil collected through that day, not the amount collected on that day. A cumulative total can never go down, so the bars rise even when daily collection is falling. In my graph below, which shows daily oil collection, the story is obviously quite different.

While the amount of collection increased in the beginning, it has decreased or held steady for the last four days and is now well below the average amount of daily collection for this period as a whole. Things are definitely not getting better. How do you spin bad news like this? One way is to create a misleading graph, but cover your ass by doing it in a way that isn’t an outright lie.
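The difference between the two views can be sketched in a few lines of Python. The running totals below are hypothetical numbers chosen only to show the effect; they are not BP's figures, and `daily_from_cumulative` is an illustrative helper name.

```python
def daily_from_cumulative(cumulative):
    """Convert running totals into per-day amounts by differencing."""
    daily = []
    previous = 0
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


# Hypothetical running totals in barrels (not BP's actual figures).
cumulative_barrels = [1000, 3000, 6000, 8000, 9500, 10500]
daily_barrels = daily_from_cumulative(cumulative_barrels)
average = sum(daily_barrels) / len(daily_barrels)

for day, (total, amount) in enumerate(zip(cumulative_barrels, daily_barrels), start=1):
    trend = "below average" if amount < average else "at or above average"
    print(f"Day {day}: cumulative {total:>6}, daily {amount:>5} ({trend})")
```

In this made-up series the cumulative column rises every day, while the daily column peaks on day 3 and then falls below the period average, which is the pattern a cumulative bar graph hides.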

Take care,

Oracle—Have you no shame?

April 29th, 2010

Oracle Corporation takes its name from the treasured advice-givers of ancient Greece. Its name is ironic at times, however, when its advice is far from sage. When it comes to data visualization, and dashboard design in particular, Oracle gives some downright awful advice.

I received an email from one of my readers who uses Oracle’s OBIEE tool to develop applications for his customers. He attached the following graph as an example of what Oracle teaches people when they attend the online course “Oracle BI Enterprise Edition – Build Good Dashboards”:

Based on this graph, I’m guessing that Oracle now outsources the development of its courses to the primate house of the local zoo. Although I haven’t seen the course myself, I’m told that this graph is typical. If this is what a leading Business Intelligence software vendor considers an effective way to display data, it’s no wonder that people are frustrated with the industry.

Almost every aspect of this graph fails miserably.

  • It has been complicated by a 3-D rendering of the plot area and the bars (or tubes in this case), which does nothing but make it harder to interpret the values. Notice that the quantitative scale for “Dollars” and “Year Ago Dollars” on the left axis is aligned with the front of the graph, but the scale for “Forecasted Dollars” and “Forecasted units” on the right is aligned with the back of the graph.
  • I assume that the quantitative scale on the left is for dollars and the one on the right is for units. If this is the case, however, the title that appears on the right (“Forecasted Dollars, Forecasted Units”) is incorrect.
  • A dual-scaled graph—one quantitative scale on the left for dollars and one on the right for units—should usually be avoided, especially on dashboards, because it can be confusing and misleading. For example, notice that the forecasted units line intersects the dollars’ bars, which would naturally incline anyone viewing the graph to compare their magnitudes, yet this would be entirely meaningless, because magnitude comparisons can’t be made between them when they have entirely different scales and units of measure.
  • Forecasted units would not be useful on this graph without including actual units, which has apparently been forgotten.
  • Lines representing “Forecasted Dollars” and “Forecasted Units” have been used to connect values per region, which makes no sense. The patterns formed by the lines are completely arbitrary and could be changed by sorting the regions in a different order.
  • The lines have large, clutter-inducing data points along them.
  • The lines appear to have some sort of drop shadow or lighting effect, which makes it look as if there are four lines rather than two.
  • “Dollars” for the current year and “Year Ago Dollars” are meant to be compared, not summed. By using stacked bars rather than placing separate bars side by side for the current year’s dollars and the previous year’s dollars, the comparison is difficult to make. The bars as a whole, consisting of both years stacked on one another, represent a sum that is useless in this situation.
  • Given the fact that the X-axis has the title “Region”, there is no reason to clutter the graph by including “REGION” in each of the labels.
  • The prominent vertical grid lines that separate the regions are unnecessary, resulting in clutter.
  • The tick marks along both vertical axes are unnecessary, because gridlines appear in the graph at the same positions.
  • The minor tick marks on the right-hand vertical axis are darker than the major tick marks.
  • The positions of the two Y-axis titles are inconsistent, resulting in a sloppy appearance.

It is as if the person who created this “Good Dashboards” example of a graph did everything possible to make it as ineffective as possible.

How can a vendor that claims to understand data and presumes to teach people best practices in its use know so little? Oracle, you should be embarrassed.

Take care,

The Unprecedented is Overrated

April 12th, 2010

I was invited to speak at a recent TEDx event in Berkeley, but I withdrew late in the game when the TED folks asked me to sign a contract that would have given them the right to edit my talk however they wished without my permission. This is something that I never allow, because I’ve learned the hard way that even people with good intentions can screw things up by making bad edits. I’m writing today, not to talk about the rights of content creators to their work, but about the theme of this TEDx event, which struck me as misguided. I and the other speakers were asked to tie our talks to the theme “Doing the Unprecedented.” When I received this request from the event coordinator (TED calls them “curators”), I told her that I would tie my talk to this theme by making the case that doing the unprecedented is highly overrated.

Most of what we can do to make the world a better place involves, not doing the unprecedented, but doing what matters and what works, whether unprecedented or not. This might not be as exciting as the unprecedented, but it’s desperately needed. I believe that too many opportunities are wasted because we glorify the unprecedented for its own sake.

In the United States over 150,000 people die each year due to post-surgical complications. That’s three times the number of traffic fatalities. What makes this even more shocking, however, is the fact that half of these post-surgical deaths could have been prevented, not by doing the unprecedented, but by doing what medical professionals already know, but often fail to do.

Many of these surgical failures are caused by the complexity of the work. When tasks are complex and you’re working under stress amidst distractions, it’s hard to remember everything that should be done. A movement is now underway to solve this problem, which involves nothing unprecedented, but something that another highly skilled group of professionals—pilots—have been doing for many years. Surgical teams are beginning to use checklists.

Atul Gawande, who led the effort to create this surgical safety checklist for the World Health Organization, writes convincingly about the need for checklists in all professions that deal with complexity in his new book The Checklist Manifesto: How to Get Things Right.

In the field of data visualization, failures are more common today than successes, not due to complexity, but to the fact that few people have been trained in the simple principles and practices of graph design. As a result, they rely on software tools to do the work for them and most of those tools lead them astray, encouraging them to produce silly, useless displays like this.

This is a travesty, because we are living at a time when we could be making tremendous use of data to inform better decisions, and most of the rules for doing this well have been known for years.

Here’s an example of one of the earliest quantitative graphs, hand drawn by William Playfair in 1786. In his time, Playfair did the unprecedented by inventing or greatly improving many of the quantitative graphs that we use today.

Back in 1983, Edward Tufte published his first book, The Visual Display of Quantitative Information, in response to the problem of ineffectively designed graphs. And yet, despite Tufte’s efforts, plus my own and the work of several others since, it appears that graphical communication skills in general might actually be declining. Problems like this silly pie chart on Fox News, which adds up to 193%, are far too common.
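A pie chart only makes sense when its slices are parts of a single whole, so a simple sanity check before drawing one is to confirm that the percentages total 100. The Python sketch below shows one way to do that; the slice labels and values are placeholders chosen only so that they total 193, and `check_part_to_whole` is a hypothetical helper name.

```python
def check_part_to_whole(shares, tolerance=0.5):
    """Return (is_valid, total): valid when the shares sum to about 100 percent."""
    total = sum(shares.values())
    return abs(total - 100) <= tolerance, total


# Placeholder slices chosen so that they total 193; not taken from the broadcast.
slices = {"A": 70, "B": 63, "C": 60}
is_valid, total = check_part_to_whole(slices)

if is_valid:
    print(f"Slices total {total}%: a pie chart is at least arithmetically sound.")
else:
    print(f"Slices total {total}%: these are not parts of one whole, so use bars instead.")
```

The small tolerance allows for rounding in published percentages; totals far from 100 usually mean the categories overlap, in which case a bar graph is the honest choice.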

When did we lose sight of the fact that data displays are about data, expressed clearly, accurately, simply, and meaningfully? When did Business Intelligence (BI) take a wrong turn down the path to business stupidity? In our efforts to do the unprecedented, to make ourselves look impressive by decorating our data in impoverishing ways, we’ve adopted practices that make us dumb. Most of the principles for doing this right have been known for a long time. Let’s save the unprecedented for situations that demand it. For most data sense-making and presentation, let’s do what’s needed and what works.

Take care,