Visual Business Intelligence


	Thanks for taking the time to read my thoughts about Visual Business Intelligence. This blog provides me (and others on occasion) with a venue for ideas and opinions that are either too urgent to wait for a full-blown article or too limited in length, scope, or development to require the larger venue. For a selection of articles, white papers, and books, please visit my library.

2012 Perceptual Edge Dashboard Design Competition: We Have a Winner!

October 15th, 2012

I was pleased and frankly surprised to receive 91 submissions to my dashboard design competition. Surprised because designing a student performance dashboard from scratch based on the data that I provided was not a trivial task. I was especially pleased to find a dramatic improvement in the general quality of entries since the last competition that I judged back in 2006. Almost every entry exhibited qualities that far surpass those of the dashboards that are typically produced and used today. I’m grateful to everyone who took the time to participate.

This competition served many purposes:

To give dashboard designers an opportunity to test and further hone their skills.
To provide me with many fresh examples of dashboards that were all designed to serve the same audience and purpose—a teacher who needs to regularly monitor the performance and behavior of her students—that I could include in the second edition of my book Information Dashboard Design. I now have a rich and varied set of dashboards that I’ll use to demonstrate effective design and to illustrate common problems that still show up, even in dashboards that are created by experienced designers who take dashboard design seriously.
To showcase exemplary dashboards that could actually be used for an important purpose: to improve educational outcomes.

All three purposes were well served by this rich and varied collection of entries.

Now, let’s get to the winners. Out of the 91 entries, I narrowed the list to the top 8 and scored them using the following criteria:

Each criterion was weighted according to importance, producing a total possible score of 100.
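To make the arithmetic of a weighted score concrete, here is a minimal sketch in Python. The criterion names and weights below are hypothetical placeholders, not the ones used in the judging; the only property taken from the text is that the weights add up to a total possible score of 100.

```python
# Hypothetical criteria and weights, for illustration only.
# The weights are chosen so that they sum to 100.
WEIGHTS = {
    "clarity": 40,
    "completeness": 35,
    "aesthetics": 25,
}


def weighted_score(ratings, weights=WEIGHTS):
    """Combine per-criterion ratings (each from 0.0 to 1.0) into a 0-100 score."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    # Each criterion contributes its weight scaled by how well it was rated.
    return sum(weights[name] * ratings[name] for name in weights)


example = {"clarity": 0.9, "completeness": 1.0, "aesthetics": 0.8}
print(weighted_score(example))
```

A dashboard rated perfectly on every criterion scores 100, and a weakness in a heavily weighted criterion costs more points than the same weakness in a lightly weighted one.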

At no time during the judging process was I aware of the competitors’ identities. After scoring the top eight dashboards, to get a final reality check I sent them and a record of the scores to a couple of friends who both support better uses of data in schools. They both concurred with my judgment.

Having finalized and double-checked the selection, I asked for the identities of the competitors. And the winner is Jason Lockwood. His dashboard received the highest score of 90.4 out of 100. This morning when I sent an email to Jason to congratulate him, I learned that he currently works as a usability and design consultant for IMS Health and is based in Switzerland. Although I didn’t recognize Jason’s name, he reminded me that he attended a data visualization course that I taught at IMS Health in London about two years ago. Jason originally studied art in Canada. Here’s his winning dashboard.


One of the first things you probably notice is its fine aesthetics. Its use of color, layout, and reduction of non-data-ink make it pleasing to the eye in a way that enhances usability. Because color has been used sparingly, the red alert icons make it easy to spot the students that are most in need of immediate attention (although the icons could be a little bigger to make them pop more). The tabular (rows and columns) arrangement of student information (one student per row) makes it easy to see everything about a particular student with a quick horizontal scan and easy to compare a particular metric across all students with a quick vertical scan. All of the most important metrics were consistently represented using the same dark shade of blue, which featured them above other items nicely (although the dark blue horizontal bars in the bullet graphs would have been easier to see and compare if they were thicker). This design is scalable in that the addition of more students could be easily accommodated by simply expanding the dashboard vertically. Meaningful patterns in individual student attendance information (days absent and tardy) can be easily seen. Rather than going on with my own description, which I’ll elaborate in the new edition of Information Dashboard Design, I’ll let Jason describe the work in his own words:

1 Introduction

In the course of my work as a UX engineer, I have the chance to try to bring good data visualization practices to my clients. However, many of the “dashboards” requested by those clients are closer to reporting solutions. Seeing this competition, I was delighted to be able to try my hand at a real dashboard. It was a very challenging and satisfying exercise, during which I learned a lot. I have designed this purely as a visual mock-up in Photoshop. I have the great luck of working with some very talented programmers who are incredibly adept at translating my mock-ups to pixel-perfect, working solutions, which provides great freedom for me. This usually leads to small inaccuracies in the data portrayal, but I have taken extra care this time to ensure all the representations are accurate.

2 Overall design strategy

There is a lot of information contained within the data sheet, so one of the major challenges would be how to be able to display all of it in a clear way, on a single screen. I felt that all the information was pertinent to the goal of the dashboard, so did not want to exclude anything. That led to the compromise of designing to a slightly higher screen resolution of 1400px width than what perhaps may be standard. However, that being said, I have designed it in a way that on a SXGA monitor, the entirety of the student information would be visible, and the less important, class comparison information would be off screen.

I usually base the overall colour palette on the visual identity of the client. As this was not provided, I invented the idea that the school colours were blue and grey. I would therefore use monochromatic shades of blue for data representation, grey for text and labels. For the background, I am using an off-white with a slight orange tint. This creates a subtle complement to the blue, making the data stand out a little bit more.

I chose Avenir as a font as it provides a good contrast between upper and lowercase letters for good legibility, as well as very clear numerals. With only a few exceptions (title and legends), I kept a 12pt font size to provide consistency.

3 Student data

Breaking down all the data in the Excel sheet was an interesting exercise. The first step was to prioritize the information. What would the teacher want/need to see first? I decided that the grades were crucial (that is, after all, the overall measurement of the student’s performance). With the grades I grouped together the other pure assessment information: last 5 year assessments, last 5 assignments. The assignments completed late info provides a nice segue (and visual break) from scores to more behavioural information: Absences/tardies, Disciplinary referrals and detentions.

I sorted the students by their current grade, from worst to best, so the teacher can view the problem cases first. The secondary sort is on the difference between the current grade and the target.
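As an illustration of this two-level ordering (not part of the submitted design, which was a Photoshop mock-up), here is a minimal Python sketch. The `Student` record, its field names, and the sample roster are hypothetical. The direction of the secondary sort is also an assumption: among students with the same current grade, the one furthest below target is listed first.

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    current: float  # current grade, as a percentage
    target: float   # target grade, as a percentage


def dashboard_order(students):
    """Order students so the likeliest problem cases appear at the top."""
    # Primary key: current grade, lowest first.
    # Secondary key: current minus target, so that among equal grades the
    # student furthest below target comes first (an assumed tie-break).
    return sorted(students, key=lambda s: (s.current, s.current - s.target))


# Hypothetical sample data.
roster = [
    Student("A", 82, 80),
    Student("B", 64, 75),
    Student("C", 64, 70),
    Student("D", 91, 90),
]

for s in dashboard_order(roster):
    print(s.name, s.current, s.current - s.target)
```

Sorting on a tuple keeps both rules in one place, so changing the tie-break only means changing the second element of the key.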

Having ordered the information, the next step was to visualize. The grades lent themselves very well to a bullet chart, efficiently portraying the target, current and previous scores. I used sparklines for the last 5 year assessment scores (being an interval axis), and micro-columns for last 5 assignments. For assignment late count (and later detentions and referrals) I used dots to represent the counts, as I find these are clearer to view than straight numbers.

I chose to try to represent not only the amount of the tardies and absences but also their temporal occurrence. Hopefully this can allow the teacher to identify patterns not just for each student, but for the entire class. This ends up almost like a scatter chart.

Last up for the behavioural data are the detentions and referrals, which again I represent as dots, with past term information in a lighter shade and to the left of the implied axis for comparison.

Once all the student information was portrayed, I decided that some sort of aid was needed to help the user view the information in rows. I decided on zebra striping as I believe, while it is technically more non-data ink than row lines, it is clearer and subtler at the same time (a line has two edges, top and bottom, as does a solid box, but only half as many boxes are required).

To compare the overall class performance to other classes/school/district, I combined the information from the summary tabs to create two graphs: a dot graph to show the latest median assessment scores, and a second graph to show the percentage of students’ assessment scores in each percent group. I chose a dot graph in order to emphasise the variation between the groups, but also to line up with the percentage groups of the second graph.

On the second graph, I have unfortunately had to rotate the category labels. I would normally not do this, but I did not want to reduce the font any more (even reduced to 10pt, it would still be too crowded) or expand the screen any further.

I finally added indicators on the student name to show English proficiency and special ed status, with the legend in the footer, along with the data qualification note.

4 Conclusion

Overall, I am quite pleased with the outcome of this design exercise. I believe I have managed to represent all the information in a clear and well-structured way that would fulfill its user’s needs. I have shown this to a couple of teachers and received positive feedback (and requests to produce it).

My only concern may be the colours: I design on a Mac and the colour fidelity is very good; however, the subtleties sometimes disappear when viewed on less well-calibrated screens. This would usually be something we would fix during implementation, so hopefully it is not too bad here.

Just like Jason, overall I am also quite pleased with this design. The primary improvement that comes to mind is the addition of more information in the right-hand section about the class as a whole, such as a frequency distribution of student achievement on course assignments.

Congratulations to Jason Lockwood for exceptional dashboard design.

In addition to our winner, I’d like to showcase the runner-up as well. The entry below was created by Shamik Sharma using Excel 2010.


Once again, notice the fine visual aesthetics of this design. Also notice the additional class-level information that appears on the right that doesn’t appear in the winning dashboard, especially the two distribution graphs on top, which are quite useful. And finally, notice how the frequency distribution graph of assessment scores in the bottom right corner, which uses lines (called a frequency polygon), is easier to read than the one that uses bars (called a histogram) in the winning solution. A few features in this solution don’t work as well, however, as those in the winning solution. For example, it isn’t as easy to spot the students in need of attention, and per-student attendance information is aggregated in a way that hides patterns of change through time. Overall, however, this is excellent work.

I’ll show many more examples of dashboards that were submitted in the new edition of Information Dashboard Design, both to illustrate useful design ideas and a few that didn’t work.

I invite all competitors who are interested in specific feedback about their designs to post them in my discussion forum on this site where I and others may appreciate them and offer suggestions.

Take care,


Here at Last, “The Functional Art”

September 27th, 2012

I rarely agree to write promotional statements for books. As you can imagine, I’m careful to speak words of praise only for books that are exceptionally good and in tune with the principles that I teach. When I was asked by Alberto Cairo to write a promotional statement for his new book The Functional Art, however, I embraced the opportunity with enthusiasm and gratitude. Consequently, if you look at the back cover of this book, the first quote that you’ll see is the following:

If graphic designer Nigel Holmes and data visualizer Edward Tufte had a child, his name would be Alberto Cairo. In The Functional Art, accomplished graphics journalist Cairo injects the chaotic world of infographics with a mature, thoughtful, and scientifically grounded perspective that it sorely needs. With extraordinary grace and clarity, Cairo seamlessly unites infographic form and function in a design philosophy that should endure for generations.

Stephen Few, author of Show Me the Numbers

As you know if you’ve read much of my work—especially what I write in this blog—I rarely have kind words for infographics. This is because there are relatively few infographic designers who know how to inform graphically. Few have developed the skills that are needed. Few have thought deeply enough and for long enough to become experts. Alberto Cairo is a brilliant exception. I suspect that The Functional Art will be the premier work on infographics for many years to come.

To give you an idea of its contents, here’s the list that appears on the back of the book:

Why data visualization should be thought of as “functional art” rather than fine art
How to use color, type, and other graphic tools to make your information graphics more effective, not just better looking
The science of how our brains perceive and remember information
Best practices for creating interactive information graphics
A comprehensive look at the creative process behind successful information graphics
An extensive gallery of inspirational work from the world’s top designers and visual artists

This is a work of great beauty and usefulness combined: the perfect marriage of form and function. All who are interested in infographic design should read this book closely enough to manifest its lessons in their work.

Take care,


Big Data, Big Deal

September 19th, 2012

Data did not suddenly become big. While it is true that a few new sources of data have emerged in recent years and that we generate and collect data in increasing quantities, changes have been incremental—a matter of degree—not a qualitative departure from the past. Essentially, “big data” is a marketing campaign.

Like many terms that have been coined to promote new interest in data-based decision support (business intelligence, business analytics, business performance monitoring, etc.), big data is more hype than substance and it thrives on remaining ill-defined. If you perform a quick Web search on the term, all of the top links other than the Wikipedia entry are to business intelligence (BI) vendors. Interest in big data today is a direct result of vendor marketing; it didn’t emerge naturally from the needs of users. Some of the claims about big data are little more than self-serving fantasies that are meant to inspire big revenues for companies that play in this space. Here’s an example from McKinsey Global Institute (MGI):

MGI studied big data in five domains—healthcare in the United States, the public sector in Europe, retail in the United States, and manufacturing and personal-location data globally. Big data can generate value in each. For example, a retailer using big data to the full could increase its operating margin by more than 60 percent. Harnessing big data in the public sector has enormous potential, too. If US healthcare were to use big data creatively and effectively to drive efficiency and quality, the sector could create more than $300 billion in value every year. Two-thirds of that would be in the form of reducing US healthcare expenditure by about 8 percent. In the developed economies of Europe, government administrators could save more than €100 billion ($149 billion) in operational efficiency improvements alone by using big data, not including using big data to reduce fraud and errors and boost the collection of tax revenues. And users of services enabled by personal-location data could capture $600 billion in consumer surplus.

If you’re willing to put your trust in claims such as a 60% increase in operating margin, a $300 billion annual increase in value, an 8% reduction in expenditures, and a $600 billion consumer surplus, don’t embarrass yourself by trying to quantify these benefits after spending millions of dollars on big data technologies. Using data more effectively can indeed lead to great benefits, including those that are measured in monetary terms, but these benefits can’t be predicted in the manner, to the degree, or with the precision that McKinsey suggests.

When I ask representatives of BI vendors what they mean by big data, two characteristics dominate their definitions:

New data sources: These consist primarily of unstructured data sources, such as text-based information related to social media, and new sources of transactional data, such as from sensors.
Increased data volume: Data, data everywhere, in massive quantities.

Collecting data from new sources rarely introduces data of a new nature; it just adds more of the same. For example, even if new types of sensors measure something that we’ve never measured before, a measurement is a measurement—it isn’t a new type of data that requires special handling. What about all of those new sources of unstructured data, such as that generated by social media (Twitter and its cohorts)? Don’t these unstructured sources require new means of data sensemaking? They may require new means of data collection, but rarely new means of data exploration and analysis.

Do new sources of data require new means of visualization? If so, it isn’t obvious. Consider unstructured social networking data. This information must be structured before it can be visualized, and once it’s structured, we can visualize it in familiar ways. Want to know what people are talking about on Twitter? To answer this question, you search for particular words and phrases that you’ve tied to particular topics and you count their occurrences. Once it’s structured in this way, you can visualize it simply, such as by using a bar graph with a bar for each topic sized by the number of occurrences in ranked order from high to low. If you want to know who’s talking to whom in an email system or what’s linked to what on your Web site, you glean those interactions from your email or Web server and count them. Because these interactions are structured as a network of connections (i.e., not a linear or hierarchical arrangement), you can visualize them as a network diagram: an arrangement of nodes and links. Nodes can be sized to indicate popular people or content and links (i.e., lines that connect the nodes) can vary in thickness to show the volume of interactions between particular pairs of nodes. Never used nodes and links to visualize, explore, and make sense of a network of relationships? This might be new to you, but it’s been around for many years and information visualization researchers have studied the hell out of it.
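To show how little machinery the structuring step needs, here is a minimal Python sketch of both cases described above. Everything in it is an assumption made for illustration: the topic keywords, the sample messages, and the choice to count each message at most once per topic using simple substring matching. It produces the ranked topic counts that would feed a bar graph, and the link weights and node totals that would size the lines and nodes of a network diagram.

```python
from collections import Counter

# Hypothetical mapping of topics to the words that signal them.
TOPICS = {
    "pricing": ["price", "cost", "expensive"],
    "support": ["help", "support", "ticket"],
}


def count_topics(messages, topics=TOPICS):
    """Count messages that mention each topic, ranked from high to low."""
    counts = Counter()
    for text in messages:
        lowered = text.lower()
        for topic, keywords in topics.items():
            # Naive substring match; a message counts once per topic.
            if any(word in lowered for word in keywords):
                counts[topic] += 1
    return counts.most_common()


def count_links(interactions):
    """Tally interactions per pair of people (links) and per person (nodes)."""
    # Treat a message from A to B and one from B to A as the same link.
    links = Counter(tuple(sorted(pair)) for pair in interactions)
    nodes = Counter()
    for (a, b), n in links.items():
        nodes[a] += n
        nodes[b] += n
    return links, nodes


# Hypothetical sample data.
messages = [
    "The price is too high",
    "Need help with a ticket",
    "Cost went up, need support",
    "Please help me",
    "Nice weather today",
]
emails = [("ann", "bob"), ("bob", "ann"), ("ann", "cy")]

print(count_topics(messages))
links, nodes = count_links(emails)
print(links)
print(nodes)
```

Once the data is in this shape, each topic count becomes the length of a bar, each link count the thickness of a line, and each node total the size of a node.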

What about exponentially increasing data volumes? Does this have an effect on data visualization? Not significantly. In my 30 years of experience using technology to squeeze meaning and usefulness from data, data volumes have always been big. When wasn’t there more data than we could handle? Although it is true that the volume of data continues to grow at an increasing rate, did it cross some threshold in the last few years that has made it qualitatively different from before? I don’t think so. The ability of technology to adequately store and access data has always remained just a little behind what we’d like to have in capacity and performance. A little more and a little faster have always been on our wish list. While information technology has struggled to catch up, mostly by pumping itself up with steroids, it has lost sight of the objective: to better understand the world—at least one’s little part of it (e.g., one’s business)—so we can make it better. Our current fascination with big data has us looking for better steroids to increase our brawn rather than better skills to develop our brains. In the world of analytics, brawn will only get us so far; it is better thinking that will open the door to greater insight.

Big data is built on the unquestioned premise that more is better. More of the right data can be useful, but more for the sake of more does nothing but complicate our lives. In the words of the 21st Century Information Fluency Project, we live in a time of “infowhelm.” Just because we can generate and collect more and more data doesn’t mean that we should. We certainly shouldn’t until we figure out how to make sense and use of the data we already have. This seems obvious, but almost no attention is being given to building the skills and technologies that help us use data more effectively. As Richards J. Heuer, Jr. argued in Psychology of Intelligence Analysis (1999), the primary failures of analysis are less due to insufficient data than to flawed thinking. To succeed analytically, we must invest a great deal more of our resources in training people to think effectively and we must equip them with tools that augment cognition. Heuer spent 45 years supporting the work of the CIA. Identifying a potential terrorist plot requires that analysts sift through a lot of data (yes, big data), but more importantly, it relies on their ability to connect the dots. Contrary to Heuer’s emphasis on thinking skills, big data is merely about more, more, more, which will bury most of the organizations that embrace it deeper in shit.

Is there anything new about data today, big or otherwise, that should be leading us to visualize data differently? I was asked to think about this recently when advising a software vendor that’s trying to develop powerful visualization solutions specifically for managing big data. After wracking my brain, I came up with little. Almost everything that we should be doing to support the visual exploration, analysis, and presentation of data today involves better implementations of visualizations, statistical calculations, and data interactions that we’ve known about for years. Even though these features are old news, they still aren’t readily available in most commercial software today; certainly not in ways that work well. Rather than “going to where no one has gone before,” vendors need to do the less glorious work of supporting the basics well and data analysts need to further develop their data sensemaking skills. This effort may not lend itself to an awe-inspiring marketing campaign, but it will produce satisfied customers and revenues will follow.

I’m sure that new sources of data and increasing volumes might require a few new approaches to data visualization, though I suspect that most are minor tweaks rather than significant departures from current approaches. If you can think of any big data problems that visualization should address in new ways, please share them with us. Let’s see if we can identify a few efforts that vendors should support to truly make data more useful.

Take care,


SAS, Don’t Lose Your Way

September 11th, 2012

SAS Institute has been around for a long time. Founded in 1976, SAS (originally an acronym for Statistical Analysis System) became and remains to this day the dominant statistical software vendor. Today, as business intelligence (BI) vendors that know little about statistics are promoting themselves as analytics companies, SAS has taken a wrong turn in its effort to defend its status. This is especially true in regard to statistical graphics. When BI companies show their ignorance of analytics by promoting flashy graphics that look cool but are analytically impoverished, SAS is in a great position to remind the world that statistical graphics are about statistics: meanings derived from quantitative data using proven mathematical methods, which are only valuable to the degree that they are accurate and enlightening. No vendor is in a better position to promote statistical integrity than SAS. So why is SAS taking the low road—one traveled by many BI vendors—to tout its wares?

SAS is now building products that try to compete with the likes of SAP’s Xcelsius, which obscures data behind 3-D effects of light and shadow. I recently spent a day with the bright and thoughtful folks at SAS who developed the visual exploratory data analysis tool named SAS Visual Analytics, and for the life of me I couldn’t fathom why they would undermine their product with flashy nonsense that threatens the integrity and reputation that SAS has worked so hard to build. My concern grew even greater when I was recently told that SAS has on two occasions featured David McCandless of Information Is Beautiful fame as a keynote speaker at major European events.

Photo credit: Tricia Aanderud

Perhaps more than any other proponent of information graphics today, McCandless has lured fledgling practitioners of data visualization to the dark side of sloppy analysis and eye-popping displays that rob information of its clarity, accessibility, accuracy, and meaning.

Effective data visualization is informed by science: statistics and those fields that strive to understand human perception and cognition. There’s much that we can learn from other disciplines as well, including the graphic arts, but only by focusing clearly on the goal: finding, understanding, and communicating the truth that resides in data about things that matter to preserve and improve them. SAS is undermining its own work by promoting impoverished graphics and those who advocate their use. Have the reins of the company been handed to sales and marketing executives who don’t understand or value statistics? I implore my friends at SAS: “Remember who you are—if not for your own sake, for the sake of your customers.”

Take care,


Data Visualization and the Science of Magical Thinking

September 4th, 2012

You might be wondering, “What does this usually rational data visualization practitioner mean by the ‘science of magical thinking’?” The magic that I’m referring to is the age-old practice of magicians—professional conjurers, who use sleight of hand, misdirection, and other methods to astound us. The science that I’m referring to is the work that’s being done, primarily by cognitive scientists and neuroscientists, to learn what we can about the brain (perception and cognition) from the techniques that magicians use to fool us. As it turns out, there is a great deal that magicians can teach us. Most magicians don’t understand what goes on in our brains when they misdirect our attention or manipulate perceptual processes to cause us to see something that isn’t there, but by studying their methods scientifically, we can discern those neurological mechanisms and use them for other purposes.

So how does this relate to data visualization? When we use visual representations of information to explore and make sense of it, we do so within the limits of our perceptual and cognitive abilities. When done properly, visualizations and the techniques that we use to interact with them can help us work around the limitations that are built into our brains, thereby augmenting our natural abilities through the use of technology. One critical aspect of this human-computer interaction involves the management of attention, which is extremely limited and easily distracted. The lessons that we can learn from magicians will help us build better tools for data sensemaking that complement and extend our abilities.

I first became aware of the interest of cognitive scientists in the methods of magicians in 2009 when I visited Ron Rensink at the University of British Columbia in Vancouver, Canada. While there, Ron introduced me to his colleague, Gustav Kuhn, who was visiting from Durham University in the UK. At that time, Kuhn was developing a program at Durham University to study the work of magicians. Since that time I’ve discovered that several other cognitive scientists are also doing similar work.

The best overall introduction to this area of study that I’ve read so far is Sleights of Mind: What the Neuroscience of Magic Reveals About Our Everyday Deceptions, by Drs. Stephen L. Macknik and Susana Martinez-Conde, a husband and wife team of neuroscientists at the Barrow Neurological Institute in Phoenix, Arizona.

Their interest in magic has led them to become amateur magicians themselves and to interact with many of the best magicians in the world today. They coined the term “neuromagic” to describe this specific field of study. In the introduction to their book, Macknik and Martinez-Conde explain how their work led them to the study of magic.

We knew as vision scientists that artists have made important discoveries about the visual system for hundreds of years, and visual neuroscience has gained a great deal of knowledge about the brain by studying their techniques and ideas about perception. It was painters rather than scientists who first worked out the rules of visual perspective and occlusion, in order to make pigments on a flat canvas seem like a beautiful landscape rich in depth. We realized now that magicians were just a different kind of artist: instead of form and color, they manipulated attention and cognition.

If you’re interested in learning more about the world of magic and magicians—who are the best, how they train, the extent of their amazing skills, the underworld of grifters and psychics, and how conjurers feel about their secrets being revealed by that masked magician on TV—as well as the lessons that their methods reveal about perception and cognition, I recommend the book Fooling Houdini: Magicians, Mentalists, Math Geeks & the Hidden Powers of the Mind by Alex Stone. As a magician who trained for years to rise through the conjuring ranks, science writer Stone takes us on the fascinating journey of his own experiences from the streets of New York with con men playing Three-card Monte to the inner circle of the Magic Castle, as well as the laboratories of top brain scientists.

Finally, if you want to peek under the hat, the book Magic in Theory: An Introduction to the Theoretical and Psychological Elements of Conjuring by Peter Lamont and Richard Wiseman, who are both professors of psychology and magicians, does a good job of introducing the methods behind all forms of magic.

If you’re like me, any excuse to read about magic and learn some of the mechanics of wonder will do. We love to be fooled by skilled magicians. Trust me when I assure you that a peek behind the curtain won’t change that. While enjoying the mystical journey, you’ll develop an understanding of visual perception and cognition (especially attention) that will increase your skills in data visualization.

Take care,

