Thanks for taking the time to read my thoughts about Visual Business Intelligence. This blog provides me (and others on occasion) with a venue for ideas and opinions that are either too urgent to wait for a full-blown article or too limited in length, scope, or development to require the larger venue. For a selection of articles, white papers, and books, please visit my library.

 

Business Objects Insight – A mind grind and waste of time

May 21st, 2007

You’ve got to hand it to the marketing folks at Business Objects: they’ve got balls. They don’t hesitate to make claims that are backed up by nothing but illusion. With the introduction of their new website called Business Objects Insight, however, they’ve taken marketing chutzpah to a whole new level. Want to solve the world’s great problems? Welcome to Business Objects Insight, the “world’s first mind grid,” the only site that provides “tools for data visualization, data collaboration, and a platform to publish challenges to the online community.” The challenges take on great problems of the modern world, such as global warming. Ignoring the fact that they are not the only site that does this (I’ll tell you about Many Eyes in a moment), let’s look at what they’re actually providing.

Data visualization: What they call data visualization is really just Crystal Xcelsius, their product that makes the analysis and presentation of data look like a video game and work about as effectively as a eunuch in heat.

Data collaboration: I can’t tell that any collaborative functionality has been built into the site, other than a blog and the fact that people can display their Xcelsius applications there and others can look at and use them. As far as data collaboration goes, this is rather anemic.

Platform for challenges: This isn’t really a feature; it’s the declared purpose of the site. Participants are being challenged to develop data visualizations using Xcelsius that are designed to solve major world problems. And why should people make the effort to save the world, and why should they channel their world-saving talent into learning and using Xcelsius to do so? Because Business Objects is going to pay a heart-stopping million dollars to the creators of the best world-saving applications (or actually “up to a million dollars,” which, if you think about it, could actually mean nothing at all).

This strikes me as a thinly veiled marketing scheme to sell more copies of Xcelsius under the guise of solving world problems. Business Objects’ founder and Chairman Bernard Liautaud declares:

Today the world becomes more intelligent. While there are a number of sites dedicated to aggregating and analyzing data, Insight is unique in providing members with tools for data visualization, data collaboration, and a platform to publish challenges to the online community. Our goal is to change the way problems get solved, to work on issues that have a global impact, and to challenge the conventions and paradigms of online communities.

Wow, this is quite a claim. If only Business Objects had the know-how and technology to do it. Until they actually develop or hire some expertise in the field of data visualization, they should stop claiming that they are using visualization methods to tackle even the simplest problems, let alone the great problems that plague our world. And until they have tools that provide effective visualization functionality, rather than the child’s toy of a product called Xcelsius, they should stick to selling data reporting tools that depend on the conventional paradigm of purely text-based displays.

If you’re interested in seeing a site that effectively uses data visualization as a means for people to exchange information and insights related to world problems, and does so in a way that supports true collaboration, take a look at Many Eyes, which was developed by IBM Research. The reason this site succeeds where Business Objects Insight does not is because it was designed by people who are experts in data visualization and data collaboration. Although the folks at Many Eyes are not making any grand claims about saving the world, they are providing a platform that could actually be used to support this effort.

What’s so sad about this is that there are real problems in the world that need solving, but Business Objects Insight, with its dysfunctional tools, will only waste people’s time, frittering away well-intentioned efforts and potentially good ideas that could be better applied elsewhere. If Business Objects really wants to help solve the problems of the world, why not throw their weight behind a data visualization and collaboration site that really works? Perhaps they have an ulterior motive.

Take care,

Signature

Business Objects Insight Screenshot

Dental work by road workers with jackhammers — Dashboard design gone awry

May 10th, 2007

Alright, I’m not really going to write about having your dental work done by road workers with construction equipment, but I am going to write about something that is just as painful and absurd: information displays that are designed by software engineers who know nothing about design.

A few days ago a press release was published by the George S. May International Company to announce its dashboard solutions for “small to mid-sized” companies. Here’s a quote from the press release:

We have found that two major obstacles stand in the way of business owners managing their companies more effectively. One is the difficulty in understanding the data they have. The second is difficulty in determining the cause-and-effect relationships among the different data. Management Dashboards helps business owners overcome these obstacles.

While this accurately describes two common problems in business today, I don’t agree that the dashboards that George S. May offers do much to solve them. Like those of most dashboard providers, this company’s solutions communicate information poorly. Effective dashboards result from a combination of good technology and good design. These dashboards look like they’ve been designed by technologists who sit in their dimly lit cubicles all day banging out code, isolated from the world of people. Dashboards are a medium of communication. To work effectively, they must be designed to present the information that people need to do their jobs in a way that is clear, accurate, and efficient.

It makes me sad and even a little angry when software and service companies advertise information solutions that work this poorly, because it isn’t that difficult to learn how to do this right. To illustrate this point, I asked Bryan Pierce, the Operations Manager at Perceptual Edge who has been working with me since last December, to critique one of the dashboards featured by George S. May. Before coming to Perceptual Edge, Bryan had no experience with data visualization, and because his work doesn’t require him to be an expert in this field, what he has learned he has picked up mostly indirectly, by reviewing my articles, books, and blog posts. In a short time, he has developed the skills that a company such as George S. May could use to produce solutions that really work. The rest of this blog entry was written by Bryan to illustrate how easily the visual design skills needed to dramatically improve dashboards can be learned.

Take care,

Signature

 


My name is Bryan Pierce. I am not a rocket scientist or a brain surgeon, nor do I need to be to understand and apply the principles of good information dashboard design. For the last six months, I have worked with Stephen Few at Perceptual Edge, handling the day-to-day operations. Using the skills I have picked up in that time, I am critiquing and providing recommendations for the improvement of a dashboard I recently found online, which was created by the George S. May International Company (http://www.gsmdashboards.com/): 

Prior to working with Stephen, I had no exposure to information dashboards; I wasn’t even familiar with the term. Just after Stephen offered me this job, I decided to read Information Dashboard Design so that I’d have a better understanding of Stephen’s work. Over the past few months, I’ve also read most of his articles and blog posts that address the subject. With the exception of a few conversations we’ve had on the subject, everything I know about effective dashboard design can be learned from Information Dashboard Design or http://www.perceptualedge.com/.

My discussion of this dashboard’s problems is broken into sections. First, I’ll discuss the overall problems, and then I’ll point out some of the problems that are specific to individual components.

Overall Problems:

  • Layout: Each component on the dashboard fits into an equally sized “box,” which scales when the window is resized. All of the components resize along with their containing boxes, except the table, which does not scale. Depending on your resolution, this can cause the dashboard to be unbalanced, as in the screenshot above, where some of the items are unnecessarily large, while the table is almost illegibly small. Besides poor scaling, the dashboard’s layout is hindered by the fact that it is based on a grid. As mentioned before, each component fits into an equally sized box, even though all components probably shouldn’t be equally sized. For instance, the heatmap (bottom left) has a much higher data density than the upper bar graph and should probably be allowed more space, yet the dashboard’s grid system gives them the same amount of “real estate.” More thought should have been given to the size and placement of each of these components, based on the nature of the data and its intended use.
  • Fill Color: Fill color is used to separate the dashboard into four sections. This makes it unnecessarily difficult for your eyes to track between the differently colored columns. In this case, white space alone would have probably been enough to delineate the sections (with a proper layout); if not, very thin, light gray lines could have been used. In the instances where it is necessary to use fill color to separate sections, a very light color is all that is needed.
  • Contrast: The contrast of the graphs compared to the background colors varies significantly. For instance, the heatmap and the gauge use bright colors on a dark background, so they are the most visually salient objects on the entire dashboard. But are they really that much more important than everything else? Now look at the table and the upper bar graph. They both use blues that are very similar to the background color. As such, they fall away into the background. While a good design can use differences in contrast to direct our eyes, in this case, I think these differences are arbitrary.
  • Lack of Context: Many of the graphs are hard to decipher due to insufficient explanatory text. For instance, the “Average Loan Size” bar graph would be easier to understand if it said what units it was being measured in (e.g. U.S. dollars, thousands of U.S. Dollars, etc.). In some cases, such as in the pie charts, the missing information can be obtained from a pop-up legend, by clicking the small eyeball icon to the bottom-right of the charts. However, many of these graphs could have easily been put in context through clearer titles and labels, making the pop-up legend unnecessary. Also, even with the assistance of the legend, some of the graphs are still indecipherable. For instance, notice that both the gauge and the table display the “loan count,” but represent drastically different values. In context, it’s likely that both of these values would make sense. Unfortunately, that context has not been provided.
  • Vertically-Oriented and Angled Text: The line graph, the two bar graphs, and the Pareto chart (top right) use vertically-oriented text for their axis titles. The bar graphs and the Pareto chart also use angled text for some of their labels. Vertically-oriented or angled text is harder to read than horizontally-oriented text and should not be used if it can be avoided. On this dashboard, the vertical axis titles could easily be moved to the top of the axis and horizontally-oriented, while the labels could all be oriented horizontally without moving them at all (although some of the labels in the Pareto chart would need to be split onto two lines).
  • Unnecessary Precision: Graphs are used to show the shape of data, to compare magnitudes, spot exceptions, etc. If exact values are necessary, a table works best. As such, it’s usually not necessary to show actual values of bars; a scale along the axis will likely provide sufficient precision. In this dashboard, the numbers have been written directly on the bars and pie slices, and in many cases they have been written to two decimal places of precision. This clutters the graphs and distracts us from the shape of the data. On the rare occasions when the exact values and the shape or magnitude of the values are both necessary, a table and graph should be used in conjunction. It’s less distracting and more efficient to look up values on a table that is below or next to the graph than it is when they’re integrated.
  • Use of Color Gradients: It’s a rare occasion when the use of a gradient in a dashboard actually serves to enhance its usability. Most often, color gradients are used in a misguided attempt to make a dashboard more visually interesting. At best, this is useless decoration; at its worst (such as when a gradient is used in the plot area of a graph), it can actually cause optical illusions that can adversely affect perception of the data. In this dashboard, gradients are used to decorate many of the graph and axis titles. This does nothing to enhance communication and only serves to give these titles unnecessary visual salience.

The Heatmap:

  • A heatmap is a poor choice for the display of time-series data. In addition to the actual values, which a heatmap can only display in a very rudimentary manner, based on color, it’s often useful to see the shape of change through time. If a line graph were used instead of a heatmap, it would be much more enlightening. For instance, in June, the heatmap shows us that for all but one day, the amount of loans funded was considered “Poor.” If a line had been used, we could see whether the amount of funded loans is trending upwards, downwards, or remaining flat, whether the loan amount is fluctuating significantly between days or remaining fairly steady, etc. Reference lines could be used to signify the division between “Good,” “Fair,” and “Poor” performance.
  • The horizontal axis of the heatmap is inadequately labeled and very confusing. The axis label says that each number represents the “Date,” but last time I checked, February had more than 19 days. After working at it, I was able to decipher the meaning of the days. Each number represents a business day for a given month in the year 2005. In addition to ignoring the weekends, the heatmap also ignores certain holidays. For instance, February only had 19 work days if you exclude President’s Day. As you can see, the problem with this is that nobody thinks in terms of work days. You don’t think, “Today is the 13th work day of the month.” You think “Today is the 17th day of the month.” The heatmap’s design would have been much more effective if every day of the month was included and non-business days were simply left blank or “grayed out” in some manner.
  • Color is used poorly in the heatmap. The use of similar-intensity reds and greens together makes the heatmap useless to the 10% of men and 1% of women who are colorblind. Additionally, by only encoding three different values (“Good,” “Fair,” and “Poor”) we lose out on some of the depth the heatmap could have provided. For instance, currently, a single loan could mean the difference between a day being considered Fair or Good. If a divergent color scale were used—that is, one that uses different intensities of two different colors to encode the data—the heatmap would provide much more insight. For instance, red could be used for poor loan days and blue could be used for good loan days. Days with extremely high or low loan volumes would show up as bright red or blue, average days would appear gray, and everything else would fall somewhere in between. While we still wouldn’t know exact values (color doesn’t work for this), we would have a better idea of just how “good,” “poor,” or even “fair” each day was. (A rough sketch of this approach appears below.)
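
To make the divergent-color-scale idea a little more concrete, here is a rough sketch in Python using matplotlib. I don’t have the dashboard’s underlying data, so the daily loan volumes below are invented and the weekend rule is only a stand-in; the point is simply that the scale is centered on an average value and that non-business days are grayed out rather than dropped.

```python
# A minimal sketch of a divergent-color-scale heatmap, using invented daily
# loan volumes (the dashboard's real data isn't available here).
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm

rng = np.random.default_rng(0)
months = ["Feb", "Mar", "Apr", "May", "Jun", "Jul"]
days_in_month = [28, 31, 30, 31, 30, 31]
volumes = np.full((len(months), 31), np.nan)        # NaN = day doesn't exist
for row, n_days in enumerate(days_in_month):
    volumes[row, :n_days] = rng.normal(loc=50, scale=15, size=n_days)
    # Gray out weekends instead of dropping them, so the 13th is always the 13th.
    weekend = [(day % 7) in (5, 6) for day in range(n_days)]  # stand-in weekend rule
    volumes[row, :n_days][weekend] = np.nan

fig, ax = plt.subplots(figsize=(10, 2.5))
norm = TwoSlopeNorm(vcenter=np.nanmean(volumes))    # center the scale on "average"
cmap = plt.get_cmap("RdBu").copy()                  # red = poor, blue = good
cmap.set_bad(color="0.85")                          # non-business days in light gray
im = ax.imshow(volumes, aspect="auto", cmap=cmap, norm=norm)
ax.set_yticks(range(len(months)), months)
ax.set_xticks(range(0, 31, 2), [str(d + 1) for d in range(0, 31, 2)])
ax.set_xlabel("Day of month")
fig.colorbar(im, ax=ax, label="Loans funded")
plt.show()
```

With a real data source, the only changes needed would be the actual volumes and the true business-day calendar.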

The Gauge:

  • The primary problem with the gauge in the second column is, well, that it’s a gauge. Gauges are “all the rage” on information dashboards; unfortunately, they take the dashboard metaphor too far. One of the strengths of gauges on real automobile dashboards is that they are an easy way for a mechanical device to show change through motion. For the needle to change, its base only needs to rotate, instead of physically moving from one place to another. However, computerized dashboards need not share the same physical constraints that real-world gauges do. On a computer screen, the circular shape of the gauge only wastes space and makes it unnecessarily difficult to read. Additionally, it’s rare that the data on a dashboard is updated so frequently that motion will actually be used to show change. This dashboard is no exception. The “real-time” gauge represents the “loan count” for a given month. You would want to know this number, but would you really watch the gauge to see how fast it changed, the way you look at the speedometer in your car? No. The information contained in the gauge could be displayed more efficiently in a variety of ways that would also save room. One of the best ways to display this information is through the use of a bullet graph, which Stephen developed specifically as an effective, compact replacement for gauges. (A minimal sketch of a bullet graph appears below.)
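
For readers who haven’t seen a bullet graph, here is a minimal sketch of the general idea in Python with matplotlib. The loan count, target, and qualitative thresholds below are all invented; what matters is the compact linear layout: shaded background ranges for poor, fair, and good performance, a slim dark bar for the actual measure, and a short marker for the comparative target.

```python
# A minimal bullet-graph sketch with invented numbers for the monthly loan count.
import matplotlib.pyplot as plt

actual, target = 170, 200              # hypothetical loan count and target
ranges = [120, 180, 260]               # hypothetical poor / fair / good boundaries

fig, ax = plt.subplots(figsize=(6, 1.2))
# Qualitative ranges: darkest gray nearest zero (poor), lightest at the far end (good).
for bound, shade in zip(reversed(ranges), ["0.9", "0.75", "0.6"]):
    ax.barh(0, bound, height=0.8, color=shade, zorder=1)
ax.barh(0, actual, height=0.3, color="0.1", zorder=2)   # the featured measure
ax.plot([target, target], [-0.3, 0.3], color="0.1", linewidth=2, zorder=3)  # target marker
ax.set_xlim(0, max(ranges))
ax.set_yticks([0], ["Loan count (month to date)"])
plt.tight_layout()
plt.show()
```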

The Upper Bar Graph:

  • Can you name the two months that follow May? June and July, right? When you think of them, you never think “July and June,” because that is not their order in the year. Anytime time-series data is shown on a graph, it should always be sorted from earliest to latest and never any other way. Unfortunately, in an attempt to rank the months by “Average Loan Size,” the creators of this dashboard put July before June, as seen in the upper bar graph. They shouldn’t have.

The Pareto Chart:

  • The graph in the top right corner is called a Pareto chart. In this type of graph, the bars indicate the magnitude of individual items, while the line indicates the cumulative totals of those items from left to right. For instance, the line starts at the top of the first bar and then, as we move to the second bar, the line displays the combined total of the first and second bars. Once we reach the right-most bar, the line equals the total of all items. The problem here is not the Pareto chart itself, but the scale on the left. The scale on the right expresses each bar’s ACH Pull as a percentage of the whole (which is why the line ends at 100% on this scale), but it is not clear what the left scale represents. Also, the unequal precision used on the left scale, with numbers containing anywhere from 0 to 2 decimal places, makes it more difficult to read than necessary. (A small sketch of how the cumulative line is derived appears below.)
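
If the mechanics of the cumulative line aren’t obvious, the following sketch (Python with matplotlib, using invented ACH Pull values) shows how it is typically derived: sort the categories from largest to smallest, plot the individual magnitudes as bars against one scale, and plot the running percentage of the total as a line against a second scale that ends at 100%.

```python
# A minimal Pareto-chart sketch with invented ACH Pull values.
import matplotlib.pyplot as plt

categories = ["A", "B", "C", "D", "E"]     # hypothetical categories
values = [42, 27, 15, 10, 6]               # already sorted, largest first
cumulative_pct = [sum(values[:i + 1]) / sum(values) * 100 for i in range(len(values))]

fig, ax_bars = plt.subplots()
ax_bars.bar(categories, values, color="0.5")
ax_bars.set_ylabel("ACH Pull")             # left scale: the bars' magnitudes

ax_line = ax_bars.twinx()
ax_line.plot(categories, cumulative_pct, marker="o", color="0.1")
ax_line.set_ylim(0, 100)
ax_line.set_ylabel("Cumulative % of total")  # right scale: the line ends at 100%
plt.show()
```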

The Pie Charts:

  • Pie charts should never be used. Research has found that people have a much harder time accurately judging 2-D areas, such as slices of a pie, than they do comparing lengths, such as the lengths of bars. These pie charts also exhibit another problem that most other pie charts do not. Because both dollar values and percentages are provided on the pies, it’s natural to assume that comparisons can be made between the slices of two different pies. However, this can only be done with the percentages, not with the dollar values. For instance, look at the blue portion of the pies that represent February and March. Although February’s blue slice is larger and represents a larger percentage of that month’s total, the dollar value that it represents is significantly less than the dollar value for March. This problem, and the problem of pie charts in general, could be avoided if this were redesigned as a bar chart. Each pie could be replaced with a pair of bars, and all of the bars could be placed on a single axis. The vertical axis scale could be in dollars. This design would be more compact than six pie charts, and it would make comparisons between the magnitudes of two bars in a single month or between two bars in different months accurate and efficient. If the exact percentages were necessary—which they probably wouldn’t be, given the higher accuracy of magnitude comparisons with bar graphs—they could be included in a small table below the graph. (A rough sketch of this redesign appears below.)
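
Here is a rough sketch of the suggested redesign in Python with matplotlib. The dollar values and the series names (“Funded” and “Unfunded”) are invented, since I can’t read the exact numbers from the screenshot; the point is that each pie becomes a pair of bars and that every bar sits on the same dollar scale, so magnitudes can be compared both within and across months.

```python
# A minimal grouped-bar redesign of the six pie charts, with invented values.
import numpy as np
import matplotlib.pyplot as plt

months = ["Feb", "Mar", "Apr", "May", "Jun", "Jul"]
series_a = [310, 520, 410, 450, 380, 490]   # hypothetical dollar values (thousands)
series_b = [190, 160, 230, 210, 260, 200]

x = np.arange(len(months))
width = 0.35
fig, ax = plt.subplots()
ax.bar(x - width / 2, series_a, width, label="Funded", color="0.3")
ax.bar(x + width / 2, series_b, width, label="Unfunded", color="0.7")
ax.set_xticks(x, months)
ax.set_ylabel("Loan volume (thousands of U.S. dollars)")
ax.legend(frameon=False)
plt.show()
```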

There are other tweaks and polishes that could be made to the dashboard; I have only discussed the most egregious problems. So, why do so many companies make dashboards that have so many design problems? I don’t know. Maybe they are unaware of the problems or perhaps they just don’t care. I can tell you, however, that it’s not difficult to learn the principles of good dashboard design. I have picked them up through a few hours of reading. Once they’re explained to you, they make sense and eventually seem intuitive. It only takes a little effort to learn how to create dashboards that are as effective as they should be.

The Ghost Map — Data visualization in the 19th century

April 29th, 2007

I recently finished reading a wonderful book by Steven Johnson entitled The Ghost Map: The Story of London’s Most Terrifying Epidemic – and How It Changed Science, Cities, and the Modern World. In the summer of 1854, cholera swept through a section of London with unprecedented intensity. At the time, the cause of cholera was unknown, and rapidly growing modern cities such as London, with dense populations packed into small areas, were rich breeding grounds for this disease. Most of those who concerned themselves with disease and its cure held tightly to the miasma theory that cholera spread through the air and was associated with the bad smells and the unclean urban environments that produced them. In fact, cholera is caused by a bacterium, which was spreading through the water supply. This book tells the story much as a journalist who witnessed it firsthand would do, but a journalist who had the advantage of hindsight informed by knowledge of modern medicine.

Several people of the time play important roles in this story – none more than John Snow, a medical doctor and research scientist. The ghost map refers to a map that he drew by hand during the process of his investigations, which could clearly demonstrate to anyone with open eyes that the source of the outbreak was the Broad Street well. Despite the evidence that this map displayed, however, the miasma theory of cholera transmission prevailed for several years after the epidemic. Eventually, due largely to the tenacious efforts of John Snow and an unlikely supporter, Reverend Henry Whitehead, the evidence won out and steps were taken to eliminate the conditions in which cholera could spread.

This story is important in the history of data visualization, because it is one of the earliest accounts of how a visual representation of important data was able to bring to light evidence that might have otherwise remained obscured for much longer if relegated to a tabular display. In this case, a picture (in the form of a map with quantitative data) was indeed worth a thousand words and helped to save many thousands of lives.

This is more than the story of a great map, however. It tackles larger issues, such as how new ideas and scientific discoveries become adopted, often against great resistance, even from the intellectuals of the day. John Snow and Henry Whitehead are great role models for all of us who care about discovering and communicating the truth, even when it is unpopular. I recommend this book highly.

Take care,

Signature

Ghost Map

WebCharts3D — Dysfunction at its finest

April 24th, 2007

Would you buy a pair of glasses with lenses that were so scratched up you couldn’t see through them, even if the frames looked cool? Not if you want to get from point A to point B without injury. So why would you ever buy charting software that transforms simple information into a completely unreadable display? Yesterday, GreenPoint, a self-proclaimed “leader in enterprise-wide visualization solutions,” issued a press release announcing the latest release of WebCharts3D. Even this product’s name advertises its dysfunction. Adding a third dimension of depth to bars, lines, and pies obscures the data. To this, GreenPoint adds more dysfunction by making the objects in charts transparent (for example, see-through bars), resulting in a maze of lines and angles that must be unraveled to make sense of the data.

Try to decipher the patterns and values in the following chart. Come on, give it your best shot. Even if I offered a cash prize to anyone who managed to come close, it wouldn’t be worth your effort to try, because you’d be forced to use the prize money to pay a doctor to fix the damage done to your eyes.

Band Chart

Here are a few more examples:

Column Chart
Band Chart
Pyramid Chart

Disinformation in all shapes and sizes. If this is what you’re after, then don’t hesitate to buy this product. If, however, you want your charts to actually communicate information, look for a product that proudly advertises charts that are easy to read.

Just to be fair, most but not all of this product’s charts are transparent. For example, here’s a radar chart that you could use to compare the performance of three products across eight years of time. Did you know that time is circular and that in the year 2007 we have returned to where we began in 1999? Despite this revelation, I’m finding it hard to relinquish my notion that time is linear and my desire to see this information in a simple line graph.

Radar Chart

WebCharts3D is not alone in its ability to obscure otherwise clear and simple data, but when a product this bad issues a press release, it’s hard to ignore.

Take care,

Signature

P.S. For the benefit of Ryan, who has posted a response to this blog topic, and readers who wish to see a more comprehensive sample of the charts that are available in WebCharts3D, here are all six versions of the Step Chart that appear in the sample gallery.

All Step Charts

“Sources of Power” in Data Visualization and Decision Making

April 19th, 2007

I have sometimes been amused during my attendance at high-level business meetings in American industry, amused at the discrepancy between the way we are told that important decisions get made and the truth. (Donald A. Norman, Things That Make Us Smart: Defending Human Attributes in the Age of the Machine, 1993, Basic Books, New York)

The logical-rational decision-making models that we are taught in college are worthwhile and necessary, but they are rarely used in the course of everyday business. Although there are times when they ought to be used but aren’t, it is appropriate that they are seldom used. One reason for this is that these processes require a great deal of time — something we rarely have. The other reason is that if you have expertise in a domain, you are able to make decisions that are usually just as good, but require little time.

Gary Klein, Chief Scientist at Klein Associates, Inc., has written an informative book about decision making that approaches the topic quite differently from most other books and articles. In the introductory chapter of Sources of Power: How People Make Decisions (MIT Press, 1999) he sets the stage for his treatment of the topic as follows:

During the past twenty-five years, the field of decision making has concentrated on showing the limitations of decision makers — that is, that they are not very rational or competent. Books have been written documenting human limitations and suggesting remedies: training methods to help us think clearly, decision support systems to monitor and guide us, and expert systems that enable computers to make the decisions and avoid altogether the fallible humans.

This book was written to balance the others and takes a different perspective. Here I document human strengths and capabilities that typically have been downplayed or even ignored.

Klein has studied decision making for years and has filled the book with research findings of his own and others, along with story after story of decision making in action.

Despite the fact that we at times and for good reason use “deductive logical thinking, analysis of probabilities, and statistical methods” to inform decisions, Klein reports that in natural settings decisions are rarely analytical, but are usually informed by intuition, mental simulation, metaphor, and storytelling.

The power of intuition enables us to size up a situation quickly. The power of mental simulation lets us imagine how a course of action might be carried out. The power of metaphor lets us draw on our experience by suggesting parallels between the current situation and something else we come across. The power of storytelling helps us consolidate our experiences to make them available in the future, either to ourselves or others.

If you are new to a field, having not yet developed expertise in the domain, decision making must be informed by a relatively slow process of information gathering and evaluation. If you are an expert, however, this research-laden and brain-taxing process is necessary much less often. According to Klein, experts usually make decisions based on what he calls the “Recognition-Primed Decision Model” (RPD). It combines two processes: “the way decision makers size up the situation to recognize which course of action makes sense, and the way they evaluate that course of action by imagining it”. Experts can often look at a situation and quickly recognize it as familiar, knowing intuitively what’s going on and how to respond. When multiple courses of action seem possible, they can take the first from their immediately prioritized list and evaluate its merits by running a quick mental simulation. If problems are discovered during the mental simulation, they proceed to the next possible course of action, until they find one that works, and then without further delay, take action. Are they always right? No, but if they’re experts in the field, they’re right most of the time.

How does this relate to data visualization? People who analyze data, if they are experts in the domain, usually know what’s important and make sense of it based on pattern recognition. They’ve seen it before, or something similar. Nothing presents meaningful patterns that reside in data better than properly chosen and well designed visual representations. More than any other tool, data visualization software can support meaningful pattern recognition for a broad range of people and can enable them to clearly present what they’ve found to others. Good visual analysis software pares information down to its essence in the form of a picture, removing the noise to enable clear focus on the signal. It translates abstract patterns of meaning in the data into images that can be easily perceived by our eyes and discerned by our brains, thereby serving as an external tool of cognition.

Good decisions must be based on clear presentations of data that allow experts to bypass time-consuming and often unnecessary mental gymnastics and software mechanics so they can spend their precious resources assessing the situation and responding while there’s still time. If you don’t have good visual analysis and presentation tools to support this process, you’re wasting valuable time and working partially blind.

Take care,

Signature