Over the past few months, the Obama Administration has worked to apply technology to our nation’s problems and opportunities. I applaud the efforts of our recently appointed Federal CIO, Vivek Kundra, to invest more wisely in technology and to make useful data more available both within government and to the public. While welcoming and encouraging these efforts, it is important that we critique their effectiveness as well and speak up when they could be significantly improved. It is in this spirit of patriotism that I would like to point out flaws in the new Federal IT Dashboard that is currently available in beta release. As someone who has designed a great many dashboards, I can say without reservation that the Federal IT Dashboard is about as useful in its current form as a typical business dashboard, and this isn’t a compliment.
Others have written about the Federal IT Dashboard in articles and blogs with nothing but praise. Although it’s tempting to rain nothing but praise on a child who’s performed poorly in the past when he makes an effort to improve, it’s important to supplement that encouragement with instruction as well, if you really care.
Kundra states on the website: “We tapped the brightest and most innovative minds from Federal agencies, Congress, independent oversight organizations, and the private sector as we built the IT Dashboard.” The project team apparently failed to tap anyone who has expertise in quantitative data analysis and presentation, data visualization in particular. On the dashboard’s website, Kundra invites suggestions. I think it’s time for those of us who have the expertise that appears to be lacking in the dashboard’s design to lend a hand.
When we initially access the Federal IT Dashboard, here’s what appears on the site’s home page:
The pie chart and its three companion bars on the right automatically morph every few seconds to display a few measures of a different government agency’s IT projects. Unfortunately for those of us who might actually like the time we spend on this page to produce something useful, neither the slices of the pie nor the segments of the bars are labeled, so we have no idea what we’re seeing. Perhaps the home page was meant to function only as an opening splash page of sorts and we must go elsewhere for actual information.
Let’s select the Investments tab at the top and hope for something useful.
Aha! Here we see the pie chart and bars from before, but this time the parts are labeled. Now we’re getting somewhere. Well, actually, we’re not getting anywhere without a great deal of unnecessary effort. Why are the charts three-dimensional? Despite their unfortunate popularity, three-dimensional displays of two-dimensional data are not only superfluous, they also undermine the simple task of graph perception and comprehension. As Edward Tufte would say, this is “chartjunk.” It breaks one of the basic rules of data presentation: “Do no harm.”
Those of us with expertise in quantitative data displays almost unanimously despise pie charts. The one thing they have going for them is the clear message that they’re displaying parts of a whole. It would help, however, if we could actually compare those parts by comparing the slices of the pie, but visual perception isn’t tuned to compare areas effectively. It is, however, highly tuned to compare the lengths of bars. Had the percentages of the projects that fall into the three categories of “normal,” “needs attention,” or “significant concerns” (see the legend at the bottom) been displayed as three separate bars with a common starting position and labels to the left, rather than pie charts, we could have easily compared these percentages. As it is, to make sense of the pie chart we must keep referring to the legend and then read the numbers that appear next to each slice, because the pie doesn’t do the job on its own.
We’re faced with a similar problem when we try to use the three stacked bars to understand “project costs,” “schedules,” and “CIO evaluations,” because we can’t effectively compare segments of a bar arranged end to end. Three separate horizontal bars for each set of measures (for example, “Costs”) arranged one above the other with a common starting point, on the other hand, would be easy to compare.
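To make the suggestion concrete, here is a minimal sketch of the alternative I have in mind: one bar per category, labels to the left, and a common starting position. It draws the bars as plain text so that it runs anywhere. The percentages are placeholders I made up for illustration, not figures from the Federal IT Dashboard, and the name `render_bars` is hypothetical.

```python
# Hypothetical shares of projects in each status category (illustrative only,
# not taken from the Federal IT Dashboard).
status_shares = {
    "Normal": 55.0,
    "Needs attention": 30.0,
    "Significant concerns": 15.0,
}


def render_bars(shares, width=40):
    """Return one text bar per category, all starting at the same baseline."""
    # Right-align labels so every bar begins in the same column.
    label_width = max(len(label) for label in shares)
    lines = []
    for label, value in shares.items():
        # Bar length is proportional to the percentage; 100% fills `width`.
        bar = "#" * round(value * width / 100)
        lines.append(f"{label:>{label_width}} | {bar} {value:.0f}%")
    return lines


for line in render_bars(status_shares):
    print(line)
```

Because every bar starts in the same column, the eye compares lengths directly, with no need to consult a legend or read the numbers. The same function could be called once per measure (costs, schedules, CIO evaluations) to replace each stacked bar with three aligned bars.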
Even if the information were displayed using appropriate graphs, it would still be of little use. We derive meaning from quantitative information primarily through comparisons, but for any of these measures we can only compare values related to the three qualitative states of projects: “normal,” “needs attention,” and “significant concerns.” At any one moment we can see either all agencies combined or a single agency, but never multiple individual agencies, which prevents us from comparing them. We can also see only one point in time, which prevents us from comparing what’s going on now to the past to observe how things have changed.
If we wish to compare service groups and agencies, however, we can move to another page, which displays IT projects in the form of a treemap.
Using this treemap, we can roughly compare projects among different service groups by using the sizes of rectangles to compare one measure (total IT spending in this example) and the colors of rectangles to compare a second (% change in IT spending in this example). If the treemap were better designed, we could now get a fairly good overview of how projects among service groups compare, but a couple of problems make it tough going. In the treemap above, projects are organized into four service groups: “Services for Citizens,” “Management of Government Resources,” “Service Types and Components,” and a truncated category that begins with “Support Delive…”. Unfortunately, if we want to identify individual projects in these categories, we must hover with the mouse over each in turn to get the name to appear in a tooltip window.
If we drill down into a particular service group by clicking it, we can see projects in that service group organized by agencies (“Defense and National Security,” “Health,” etc.).
Based on this view, however, can you actually see the boundaries that separate one agency from another? For some reason, the borders that separate them have become partly obscured. Eventually we can drill down to a level in the hierarchy where a treemap is no longer the best way to view the projects because the number of them could be more easily compared using one or more bar graphs, but this option isn’t available. And finally, when we’ve drilled down to the lowest level—a single project—the treemap view is entirely useless, as you can see below. The unlabeled big gray rectangle tells us only that spending on this project—whatever it is—didn’t change much since the previous year. Perhaps it didn’t even exist in the previous year.
Below the treemap in the bottom left corner we have the ability to change the colors that are currently being used to display percentage change in IT spending, ranging from -10% (blue) to +10% (yellow). This ability is useful for ad hoc data analysis, when flexibility is needed to respond to unanticipated conditions, but in an analytical application like this, which has been designed to display a specific set of measures for a specific set of purposes, it would make more sense to select a color ramp that works well and resist complicating the dashboard with choices that aren’t necessary.
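A fixed ramp of this kind takes only a few lines to define. The sketch below maps a percentage change onto a blue-to-gray-to-yellow scale, matching the -10% to +10% range described above. The specific RGB values, the gray midpoint value, and the function name `diverging_color` are my own assumptions for illustration; I don’t know how the dashboard itself computes its colors.

```python
def diverging_color(pct_change, limit=10.0,
                    low=(0, 0, 255),        # assumed blue for -limit
                    mid=(128, 128, 128),    # assumed gray for no change
                    high=(255, 255, 0)):    # assumed yellow for +limit
    """Map a percent change onto a blue -> gray -> yellow ramp (RGB tuple)."""
    # Clamp so that values beyond +/- limit take the end colors.
    clamped = max(-limit, min(limit, pct_change))
    # Distance from the midpoint: 0 at no change, 1 at either limit.
    t = abs(clamped) / limit
    end = high if clamped > 0 else low
    # Interpolate each channel linearly from the midpoint toward the end color.
    return tuple(round(m + (e - m) * t) for m, e in zip(mid, end))


for change in (-10, 0, 10):
    print(change, diverging_color(change))
```

Anchoring the neutral color at zero means that color intensity alone signals how far spending moved, and hue signals the direction, so viewers never need to choose or decode a palette.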
If we wish to see how spending on federal IT projects has changed over the years, we can proceed to the Analysis section of the dashboard and select Trends. The first of two displays that are available for viewing time-series data is an animated bubble chart, which attempts to use the method popularized by Hans Rosling of www.gapminder.org.
The strength of this approach is when it’s used to tell a story. When Rosling narrates what’s happening in the chart as the bubbles move around and change in value, pointing to what he wants us to see, the information comes alive. Animated bubble charts, however, are much less effective for exploring and making sense of data on our own. I doubt that Rosling uses this method to discover the stories, but only to tell them once they’re known. We can’t attend to more than one bubble at once as they’re moving around, so we’re forced to run the animation over and over to try to get a sense of what’s going on. We can add trails to selected bubbles, which make it possible to review the full path these bubbles have taken, but if trails are used for more than a few bubbles the chart will quickly become too cluttered. Essentially, what I’m pointing out is that this is not the best way to display this information for exploration and analysis. A simpler display such as one or more line graphs would do the job more effectively. Perhaps you’re concerned that a line graph couldn’t display two quantitative variables at once, such as “Total IT Spending” and “Percent Change in IT Spending,” which appear in this bubble chart. Assuming that two quantitative variables ought to be compared as they change through time, two line graphs (one for each variable) arranged one above the other would handle this effectively. One of the fundamental problems with the bubble chart above, however, is that the two quantitative variables that appear in it really don’t need to be seen together. There is no correlation between total IT spending and percentage change in IT spending from year to year, so there’s no reason to complicate the display by viewing them together.
Even if this animated bubble chart were a good visualization choice in this case, several problems in its design would undermine its usefulness. When I first looked at it, I was puzzled for a while about what “03. % Change in IT Spending” meant. I couldn’t understand the significance of “03. %…” It took a while to figure out that each variable that appears on the graph was numbered, beginning with “01.” and ending with “05.”, which was completely meaningless and confusing.
Unlike the intuitive use of colors that we saw in the treemap, the rainbow of colors that appears in the bubble chart is ineffective. The order of the various hues as they change from red to blue is not intuitive. Take these colors and ask people to put them in order from high to low and you’ll get a variety of answers.
Also, the ability to switch the quantitative scales from linear to logarithmic certainly makes sense to people who have been trained in statistics, but is confusing to most of the folks who would use this dashboard. For this reason, I believe this feature should be removed. While it is appropriate to include such functionality in a general purpose data analysis tool, custom analytical applications like the Federal IT Dashboard should eliminate features that aren’t commonly useful and are potentially confusing in an effort to keep the application simple. Even those who understand how to use a log scale don’t need it available on this dashboard, because few of them would be satisfied with this bubble chart; they would rather download the data and explore it using a better analytical tool.
For those who recognize the limitations and flaws of the bubble chart, an alternative in the form of a bar graph is available. For our entertainment pleasure, when switching between the two, the bubbles morph into bars before our eyes and line themselves up along the horizontal axis.
The bar chart version is just plain silly. None of the bars are labeled until you click on them one at a time to make labels such as “Education (Dept of)” and “Homeland Security (Dept of)” appear. Knowing only the identity of the selected bars (the others remain unlabeled) and watching the bars move around as spending changes through time is eye-catching but almost totally meaningless. Once again, simple line graphs for comparing changing values for the selected items would do the job much better.
Because I wanted to learn something more useful about federal IT spending, I decided to take advantage of the data feeds that are provided, but once again ran into a wall. Unfortunately, the information that can be downloaded is limited to a current year’s snapshot, which includes three variables (total spending, new/upgrades spending, and maintenance spending) broken into three time-based categories: last year’s actual spending, the current year’s enacted spending, and next year’s budgeted spending. Time series aren’t available, nor is there a way to compare actual to plan. In other words, the comparisons that I would have found most meaningful couldn’t be made based on the information that’s available.
I want to encourage Vivek Kundra to complement his fine intentions with more effective designs. There’s no need to duplicate the mistakes that most businesses still make when working with information. Data analysis and presentation best practices are not a mystery and aren’t difficult to learn. Several of us who know and care about this are available to help. I suspect that others would be willing, as I am, to assist free of charge. America can do better than this. We have a great opportunity to use information technology to make the world a better place. Let’s not miss it.