Visual Business Intelligence – The Billion Pound-O-Gram Redesigned

The Billion Pound-O-Gram Redesigned

I was recently asked my opinion of David McCandless’ chart “The Billion Pound-O-Gram,” pictured below.

The person who asked the question was impressed with this chart the first time he saw it. For this reason, he thought that I might find it an effective exception to McCandless’ other work. I do not.

This chart originally appeared in the Guardian on November 7, 2009. It was framed by the following explanation:

Huge sums of money are being bandied about and no-one knows what they are. It’s time to put them into perspective.

289 billion spent on this. 400 billion spent on that. When money reaches this level it literally becomes mind-boggling.

Yet these figures are regularly issued by the government – and the media – as if they are self-evident facts that everyone understands.

Frustrated by this, I created The Billion Pound-O-Gram.

I’ve mixed up of 2008/09 figures from the Treasury and the Guardian. Visualising the numbers like this puts them in visual context, making them easier to relate to.

I was pretty shocked by the size of the UK budget deficit – essentially the country’s overdraft. It’s more than an entire year’s worth of income tax.

A second chart appeared following this explanation to illustrate how the “top five ideas for plugging the deficit” offered by political parties in the UK might reduce it. To sum up, the story is that the UK’s budget deficit is really big, which McCandless tells by comparing it with other amounts of money that are apparently familiar to Guardian readers. The question that concerns us here is, “Does this chart tell the story effectively?” Does it put the budget deficit into perspective in a way that doesn’t boggle the mind as McCandless suggests? This is journalism, so the objective is to inform, to help readers clearly understand the size of the deficit.

By using rectangles of varying sizes arranged as a treemap of sorts, McCandless forces us to perform a perceptual task that we can’t do well (that is, area comparisons). This is a bad choice when he could have used a bar graph instead and allowed us to compare the lengths of bars that share a common baseline, which we can do exceptionally well. Furthermore, his arrangement of the rectangles is arbitrary—not based on category or on the sizes of values—which compounds the difficulty.

To better understand what I’m saying, try to answer the following questions without reading the numbers that appear in the rectangles:

Which represents a larger amount: Mortgage Lending 2007 or NHS?
How much greater is Mortgage Lending 2007 than State Pensions?
Does State Pensions compared with Tesco Revenue look like the difference between 62 and 59, or much greater?
Which is bigger: Income Support or Police?

You might argue that these comparisons aren’t critical to the story, which is primarily about the budget deficit. If we concern ourselves only with comparing the deficit with other values, nothing about the chart’s design makes this easy, even when items are adjacent to one another. For instance, try answering the following questions:

How much greater is the deficit than Africa’s entire debt to Western nations, which appears immediately below it? (And don’t cheat by reading the numbers.)
How much greater is “Bailout: Asset Purchasing and Lending” than the deficit?
How does Income Tax compare to the deficit?

Without reading the numbers, you’re forced to make wild guesses that differ considerably from the truth, which could have been presented clearly.

All of these comparisons are incredibly simple to make using the bar graph below. Take a minute to notice how easy it is to see the relationships between these values from largest to smallest and to compare them. Notice especially how easy it is to compare each of the values with the budget deficit, which appears as the vertical black reference line.

In the bar graph, I stuck with the colors that McCandless chose to make it easy to compare his chart with mine, except that I tweaked a few colors a bit to resolve minor problems. In McCandless’ chart, some colors stand out more than others, but they should be equal in salience unless there’s a reason to feature some items over others. Also, for some unknown reason McCandless sometimes altered a single color from rectangle to rectangle, which serves no purpose and creates potential confusion. For example, notice that some of the green rectangles are lighter than others, yet they all represent “Earning.”

I can’t imagine anyone seriously arguing that McCandless’ chart communicates this information as well as the alternative above, but is his chart more engaging? Some folks might find it more engaging purely on the level of entertainment, but not in a way that encourages or supports meaningful consideration of the information, resulting in optimal understanding. Journalism should tell the story truthfully and clearly.

Take care,

Wednesday, June 29th, 2011 at 10:03 am

31 Comments on “The Billion Pound-O-Gram Redesigned”

By Chip Lynch. June 29th, 2011 at 11:21 am

Hi Steve,

I’ve been following your blog for a few months now (since I first saw it), and I thoroughly enjoy it, have learned some things from it, and by and large agree with you about the problems with information delivery and visualization.

So I hope you don’t take any offense that I’m not completely on page with you in this instance. :-)

I actually like the Guardian’s chart and in many ways prefer it to the bar charts. The first reason you’ve admitted — it’s more engaging on a purely visual level, however I feel that you undervalue that aspect. Journalism, I believe, should have a portion of responsibility to engage, and while that should never trump truth and clarity (else you get pandering or outright lying), I think it’s important. The Guardian’s chart certainly contains no mis-truths, and I think its clarity is more open for debate.

As a first example, the color use in the Guardian’s chart is near 100% coverage of the overall page. While it may be true that comparing areas for nearly-similar values is difficult, it’s made easier by this large coverage area. The bar charts, in comparison, take up a very small portion of the page.

That becomes vitally important in the colors of smaller values. For example, I cannot tell in your bar chart the difference between the “Roman Abramovich” and “Aid” bar’s colors nearly as readily as I can in the original.

Also, the larger color coverage allows me to more quickly search for colors. I can identify very quickly which boxes are “Fighting” and which are “Giving”, but a scan of the bar chart does not make that as easy.

Nextly, McCandless’ chart includes some nesting (Tesco Profits, Tax Fraud, NHS IT spending), which is completely lost in the bar chart — I’ll not argue on whether it’s necessary, but it’s certainly information and is very clear from the nested boxes.

Last, back to the area vs. length argument, you disapprove of reading the numbers, but they’re available, completely legible, and of course alleviate any confusion to the underlying values. It’s unfair to discredit the chart asking us to not “cheat” by using a feature that it actively and effectively employs, particularly when you do the same on the bar chart.

Also, something I just noticed, while you assert that we’re not perceptually good at comparing areas, we may be better at aggregating them. For example, is Bailing a larger chunk of the budget than Spending, or Giving more than Earning? I don’t think I can clearly answer that from either chart, but I think the area chart gives a slightly better chance. (This idea alone has had me staring at the charts for some minutes — I’d be interested in any real tests on that).

Not trying to be overly critical or negative — I was mostly just taking the challenge you proposed in the last paragraph. I completely agree that the dark budget box is out of place, and the bar chart is far better at showing how it aligns. I also agree that the treemap’s random arrangement of boxes is distracting, and the nicely sorted bar lengths are visually appealing and certainly make comparing two bars trivial.

Still, it’s a tough sell, to me, that the bars are a complete win, and particularly compared to some charts such as the ridiculous GE ones you pointed out last week, I think the Guardian’s treemap, while flawed, is reasonably good.

By Fesh. June 29th, 2011 at 12:39 pm

@Chip

The argument against displaying the raw data is extremely valid. If in my visualization I have to put down the raw numbers, why do I need the visualization? Steve’s version justifies the visualization because, without the numbers I’m not left thinking…hmm, how close is “Income from National Insurance” to “Africa’s National Debt”? Without the numbers (on McCandless’s) you’d think it’s more since the block is the same height but wider. But take the raw numbers of Steve’s and you can clearly see that Income is less than Africa. McCandless’s visualization leans on those numbers, Steve’s puts them there as additional information.

You definitely make a good point about categories not being comparable (“Bailing” vs. “Spending,” “Giving” vs. “Earning”), a comparison that both methods fail to support. However, you have to remember a utility of a treemap (which you pointed out): nesting. But by randomly scattering categories as the best fit, McCandless fails to properly use the nesting. Why not just have a large “Bailing” square which contains all of the Bailing information and break that down? Who knows, but he doesn’t. Furthermore, the visualizations, if my crude math is correct, are out of proportion. If “Bailout: Asset Purchase & Lending” is 400 and “Bailout: Cash” is 200, wouldn’t it stand to reason that the areas should be 2:1 (Asset:Cash)? Well, using some crude measurements, the Cash square seems to be a little over 10% too small to accurately relate. This also is exemplified by the example in my previous paragraph. Despite the fact that “Income from National Insurance” is 28 billion pounds less than “Africa’s National Debt”, the area of Income is greater.

I think the bigger argument here is that McCandless does things wrong for the sake of art. That’s fine, if he wants to represent himself as producing art projects. But he’s not. He’s giving people information as statistics and his visualizations of these facts are lies. That’s a big problem.

By Andrew Rickard. June 29th, 2011 at 12:43 pm

I wonder how the information would be received had you designed a bar chart showing the differences in pounds between the U.K. budget deficit and each of the other “Big Amounts of Money”. I wonder if it would have communicated the information equally or more effectively than your bar chart?

Or perhaps instead of the difference in pounds it could have been expressed as a percentage difference from the U.K. budget deficit. Expressed this way you wouldn’t need to indicate that the values were in (Billions of Pounds).
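The two variants suggested above (difference in pounds, and percentage difference from the deficit) can be sketched in a few lines of Python. This is only an illustration: the item amounts are the ones quoted elsewhere in this discussion, while `ASSUMED_DEFICIT` is a placeholder, since the deficit figure is never stated in the text here, and the function name is made up for the example.

```python
# Amounts in billions of pounds, as quoted in the post and comments.
amounts = {
    "Africa's entire debt": 128,
    "Income from National Insurance": 100,
    "VAT": 80,
    "State Pensions": 62,
    "Tesco Revenue": 59,
}

# ASSUMPTION: placeholder value for the UK budget deficit. The text of this
# discussion does not give the figure, so substitute the real one before use.
ASSUMED_DEFICIT = 175


def differences_from_reference(values, reference):
    """Return {name: (absolute difference, percentage difference)} vs. reference."""
    result = {}
    for name, value in values.items():
        diff = value - reference
        result[name] = (diff, 100.0 * diff / reference)
    return result


for name, (diff, pct) in differences_from_reference(amounts, ASSUMED_DEFICIT).items():
    print(f"{name:<32} {diff:+5d} bn  {pct:+6.1f}%")
```

Either column could then be drawn as a bar graph with the deficit as the zero baseline. Whether that would communicate better than the original bar graph is an open question that the code does not settle.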

Anyway food for thought…

By Andrew. June 29th, 2011 at 12:43 pm

I’m sorry, I realize we’re entitled to different opinions, but I’m having a hard time understanding why McCandless’ display could ever be considered “more engaging”. Personally, I find the bar chart much more engaging: in less than a second, I can answer so many questions, and even discover new questions I hadn’t thought of.

> The Guardian’s chart certainly contains no mis-truths…

I disagree. It’s immediately clear that many of the areas in McCandless’ chart are not proportionate. You suggested “Roman Abramovich” and “Aid” as being easier to compare in the area chart – the values indicate 7:5, the bars appear to correlate, but the original areas roughly indicate 2:1. That’s a pretty noticeable inaccuracy. While the numbers aren’t lying, the visualization is definitely misleading. Why do you suggest the areas are easier to compare when they are dead wrong?

> I can identify very quickly which boxes are “Fighting” and which are “Giving”, but a scan of the bar chart does not make that as easy.

With the bar chart, you can just scan the list in one direction (vertically). Why is it easier for you to scan in two directions (vertically and horizontally) with the area chart? At the very least, shouldn’t scanning the bar chart be just as easy, if not easier?

> Last, back to the area vs. length argument, you disapprove of reading the numbers…

I think he was merely demonstrating that if we _must_ look at the numbers in order to compare, then the visualization no longer serves a purpose. In McCandless’ chart, I have to perform the comparison myself in many cases; I have to find the numbers and mentally sort them. With the bar chart, I can make comparisons easily without numbers. I think Stephen talks about areas vs. lengths in all of his books, but I’d strongly recommend reading Chapter 4 of “Information Dashboard Design”; it contrasts all of the potential methods of comparison (including areas and lengths), and explains how and why some work better than others (and when to use them).

By santiago. June 29th, 2011 at 1:28 pm

Stephen, perhaps the tests you propose aren’t adequate, or are incomplete, in the sense that there could be some tasks that are easier to perform with the 2D display. I recognize at least one thing that’s very easy to do on McCandless’ surface and very hard in the bar graph: to visually ‘select’ one rectangle and then look at others and compare surfaces many times (back and forth). Try this: focus on Defense Budget and compare it with 10 other values no matter where they are placed. It’s always easy to return to the point of departure. If you try the same task with the bar graph, you’ll see it’s very easy to get lost, and returning to the starting point takes effort. Individually each comparison may be less accurate in the surface map, but you can perform many more comparisons.

In general terms I believe the treemap (actually it’s just a quadrification) enables very diverse comparisons. The eye goes around selecting different pairs of surfaces or even blocks (some are clearly suggested by the author by means of color or composition). The bar graph forces a linear reading (thus proposing only consecutive comparisons).

The pleasure lies in this non-linear reading or ‘perceptual freedom’, and I believe it is part of the success of this graph. And by success I mean not only the fact that a lot of people showed interest in it, but also that a lot of people read it and had an insightful experience.

By Chip Lynch. June 29th, 2011 at 2:04 pm

@Fesh and Andrew:
> You suggested “Roman Abramovich” and “Aid” as being easier to compare in the area chart

I originally said that the _colors_ are easier to compare. I had trouble differentiating the colors in the bar charts; the areas, I agree, are easier to compare in the bar chart, absolutely (since it is sorted).

> It’s immediately clear that many of the areas in McCandless’ chart are not proportionate.

I’ll take the hit on that one. I mistakenly assumed that the areas were in fact proportionate, without actually doing the calculations. If the areas are wrong, they’re wrong, and that’s unforgivable, agreed. If this were a proper treemap, though, and properly calculated, then I think my comments stand (back to that in a moment).

> Why is it easier for you to scan in two directions (vertically and horizontally) with the area chart?

I’m saying that it seems easier to find colors to me. I’m guessing it’s because they take up so much more space; the number of pixels alone make colors easier to find.

> I think he was merely demonstrating that if we _must_ look at the numbers in order to compare, then the visualization no longer serves a purpose.

I still have a lot of thoughts on this. I think there’s a lot of overstatement about our ability to perceive the differences in areas. For the most part, we’re not bad at it. If the bar chart were sorted somehow else — alphabetically for example, how much better would it be? We’ve sacrificed colored pixels, nesting, and the option (poorly executed here, admittedly) for better clustering in the treemap, for perfect ordering in the bar chart. Remember that, for most comparisons, the area chart works fine. It’s easy to find the few largest, few smallest, and you can probably do a good job of sorting the whole lot visually.

The addition of the printed pound amounts fixes the shortcoming WITHOUT sacrificing anything. It’s easy to make many comparisons without them, so the visualization is still worthwhile, particularly for the colored categories.

Also, I think the bar chart is very bad at some sorts of questions… for example, “what portion of the whole is NHS, or Giving” rather than just comparing between two elements. Now, another confession: I didn’t realize this wasn’t a true subdivided map when I first saw it, so that question is less useful. Still, I find that aggregating areas seems easier… I think a reasonable argument can be made that the tradeoffs are valid.

Which brings me to…
> I can answer so many questions, and even discover new questions I hadn’t thought of.

See, I think this time I actually disagree. There’s nothing the bar chart can do that the area chart can’t. It can do some things better (relative values), but how much better is debatable (and not better at all if you count the printed amounts). It simply can’t do some things (nesting). So I don’t know how it can spawn more questions than the area chart.

> I’m having a hard time understanding why McCandless’ display could ever be considered “more engaging”

This is completely subjective so I understand disagreement, but I think it’s clear that it at least “could” be more engaging. I find that a list of text elements is less fun to search through than the scattered area chart. That’s completely qualitative, I realize (let’s not argue if “fun” is good — there’s no business need here, I’m talking just about engaging readers). For one thing, not being too familiar with the subject space, I’m not looking for anything specific, so browsing with my eyes wandering around is genuinely more pleasing than simply reading down a list. As an example, take a look at this (which people here have probably seen):

http://www.wallstats.com/deathandtaxes/

It’s not going to rate very highly as a visualization by the quantifiable terms that are often discussed here, and I agree, but it’s really enjoyable to read through, to find new pockets of spending and how they’re distributed. If this were a straight list I’d go mad reading it all, it would be utterly boring. I think the area chart is closer to this example in terms of engagement than is the bar chart.

I do like the thought provoking posts and comments, though. I’m glad for the discussion.

By Stephen Few. June 29th, 2011 at 2:06 pm

Chip,

Thanks so much for the thoughtful response. I appreciate it when people challenge what I say in substantial and reasonable ways, which describes your comments well. I’ll try to address your points one at a time.

It is true that this particular example of McCandless’ work is better than most, but only because little is required of a chart to make his point, which is that the budget deficit is really big. His chart manages to say this, but he could have said this better. I think the bar should be set higher (pun intended).

You pose the challenge: “While it may be true that comparing areas for nearly-similar values is difficult, it’s made easier by this large coverage area.” Actually, your claim isn’t accurate. The fact that these rectangles are relatively large makes them easier to see but not easier to compare. Also, it is not only true that comparing nearly similar areas is difficult; comparing areas of any kind is difficult. For instance, a rectangle that is precisely half the size of another rectangle will not appear half the size to our senses if they differ from one another in both height and width (that is, in two dimensions). A bar that is half the size of another (varying in one dimension only), however, we can perceive to a fair degree of precision.
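The geometry behind this point can be illustrated with a small calculation. The sketch below is not a model of perception; it only shows how much the visible dimensions change when a value is halved, assuming a rectangle that keeps its aspect ratio versus a bar of fixed width. The function names and the 100-unit starting sizes are chosen purely for illustration.

```python
import math


def scale_rectangle(width, height, area_ratio):
    """Scale a rectangle to area_ratio times its area, keeping its aspect ratio."""
    side_factor = math.sqrt(area_ratio)  # each side scales by the square root
    return width * side_factor, height * side_factor


def scale_bar(length, value_ratio):
    """Scale a bar of fixed width: only its length changes, in direct proportion."""
    return length * value_ratio


w, h = scale_rectangle(100.0, 100.0, 0.5)
print(f"half-area square: {w:.1f} x {h:.1f}")       # sides shrink to about 70.7
print(f"half-value bar:   {scale_bar(100.0, 0.5)}")  # length shrinks to 50.0
```

Halving the value shortens each side of the square by only about 29 percent, while the bar loses exactly half its length. The change that the eye has to detect is therefore smaller and spread across two dimensions in the area encoding.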

Your next claim regarding size, however, is spot on. You correctly point out that larger areas of color are easier to discriminate than smaller areas. Using the full-sized version of the bar graph, however, the bars are not so small that the colors cannot be discriminated. Neither chart works well in the small version. In fact, the small version of McCandless’ chart is completely unreadable.

Furthermore on the topic of color you say: “The larger color coverage allows me to more quickly search for colors. I can identify very quickly which boxes are ‘Fighting’ and which are ‘Giving’, but a scan of the bar chart does not make that as easy.” It is true that larger areas of color are easier to spot than smaller areas, all else being equal. Neither chart is designed to help us quickly find items in a particular category. The assumption is that this isn’t useful, given the purpose of the chart. If it is useful, then rather than trying to get the main chart to support this purpose, which it could never do effectively, a second chart specifically designed to support this use would work best. A common mistake that people make when telling stories with charts is to attempt telling the entire story with a single chart. Finding items in the same category would be supported well with a bar graph that groups bars of the same category together.

I didn’t include nested values, such as the bailouts of the two banks that account for 56 and 48 billion pounds respectively of the 289 billion pound total, only because this didn’t seem to add anything useful to the story. I could have easily shown these parts of the total bailout of banks by including them as segments of the bank bailout bar (i.e., as a stacked bar). This would have enabled comparisons to the deficit more effectively than the rectangles in McCandless’ chart.

Here’s your next point: “Back to the area vs. length argument, you disapprove of reading the numbers, but they’re available, completely legible, and of course alleviate any confusion to the underlying values. It’s unfair to discredit the chart asking us to not ‘cheat’ by using a feature that it actively and effectively employs, particularly when you do the same on the bar chart.” I included the numbers on the bar graph only because McCandless included them, based on the assumption that precise values were useful. The difference between McCandless’ chart and mine is that mine would work without the numbers, making comparisons of the values possible. It is fair to point it out when a chart includes numbers as text to make up for the fact that values cannot be effectively compared without them. Having to read numbers to compare the values adds time and effort to the process that shouldn’t be necessary with a graphical quantitative display.

Lastly, you make the point: “While you assert that we’re not perceptually good at comparing areas, we may be better at aggregating them. For example, is Bailing a larger chunk of the budget than Spending, or Giving more than Earning? I don’t think I can clearly answer that from either chart, but I think the area chart gives a slightly better chance.” What you have discovered on your own is the one thing that research has demonstrated as an advantage of a pie chart over a bar chart for part-to-whole displays. It is easier to compare the sum of a set of slices to the sum of another set of slices in a pie chart than it is to compare the summed lengths of bars in a bar graph. While this is true of pie charts, as one form of area comparison, summing the sizes of the rectangles in McCandless’ chart is more difficult. If the sums of things ought to be compared, the best way to present them is to pre-aggregate them and then show the results in a bar graph. For example, if it would be useful for people to compare the sum of these amounts per category (bailing, spending, etc.), the following graph would support this effectively.

Redesign of Billion Pound-O-Gram by Category
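The pre-aggregation step described above, summing amounts per category before charting them, can be sketched as follows. The rows are hypothetical placeholders rather than McCandless’ actual figures or category assignments, and the text bars merely stand in for a real bar graph.

```python
from collections import defaultdict

# HYPOTHETICAL rows of (category, item, amount in billions of pounds).
# Replace them with the chart's real items and category assignments.
rows = [
    ("Bailing", "item A", 400),
    ("Bailing", "item B", 200),
    ("Spending", "item C", 110),
    ("Spending", "item D", 60),
    ("Earning", "item E", 100),
    ("Earning", "item F", 80),
    ("Giving", "item G", 5),
]


def totals_by_category(rows):
    """Sum amounts per category and return (category, total) pairs, largest first."""
    totals = defaultdict(int)
    for category, _item, amount in rows:
        totals[category] += amount
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


def text_bars(totals, width=40):
    """Render one text bar per category, scaled so the largest total fills width."""
    largest = totals[0][1]
    lines = []
    for category, total in totals:
        bar = "#" * max(1, round(width * total / largest))
        lines.append(f"{category:<10} {bar} {total}")
    return lines


for line in text_bars(totals_by_category(rows)):
    print(line)
```

Because the sums are computed first, the reader compares one bar per category along a common baseline instead of mentally adding up scattered rectangles.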

Something that I didn’t mention in my critique of McCandless’ chart is that it suggests something about the data that isn’t true. Treemaps, like pie charts, are part-to-whole displays. When I first looked at The Billion Pound-O-Gram on McCandless’ website, I assumed for quite a while that the values were parts of some whole, such as total UK government spending. The reason is that I assumed he had used a treemap properly. People would be less likely to make this mistaken assumption when looking at a bar graph.

By Stephen Few. June 29th, 2011 at 2:37 pm

Santiago,

The linearity of the bar graph offers a significant advantage. Scanning and comparing items in one direction is more efficient and easier than two—the opposite of what you seem to assert. Using your example, comparing the Defense Budget to 10 other values no matter where they are can be done much more efficiently and with greater ease using the bar graph. The starting point—the Defense Budget—appears right next to each bar. You don’t ever need to return to the starting point, because you’re always there. The arrangement of values from highest to lowest serves three useful purposes: (1) you can easily see the ordinal relationship between all the values, (2) you can easily see where the budget deficit fits relative to the other values, and (3) you can more easily compare values when they’re arranged in this fashion.

Contrary to your claim, nothing about McCandless’ chart clearly suggests particular comparisons, other than in those cases where rectangles are contained within larger rectangles, which encourages comparisons between the smaller rectangles and between them and the larger rectangle, but these are not particularly useful for telling this story. Our eyes bounce around McCandless’ chart without being led in a meaningful way. If you enjoy having your eyes jump around randomly, then you’ll find this chart entertaining, but not particularly informative.

By Richard Taylor. June 29th, 2011 at 10:45 pm

I have to confess that when I looked at the McCandless chart, my first impression is that the Budget deficit does not look all that big compared to some of the other big boxes on the chart and particularly it is not that big when compared to all the other boxes together, so maybe the Budget Deficit is not such a problem after all. I have not read the article so I do not know if that is the impression that it is supposed to convey, however I suspect that it is not.

My second thought arrived a few seconds later: what do these numbers mean, and what does it mean to compare them? The chart compares assets with spending and then mixes up years to further confuse matters. There is mortgage lending for 2007 at the height of the boom and mortgage lending in 2008 at a much reduced level, but why are they on the same chart? Then there is Trident, which is spending over a 25 or 30 year period, which does not compute when compared with spending in a year.

A treemap is used to compare parts to a whole when there are many parts. Putting these unrelated numbers into a treemap-like structure suggests that they are parts of a whole, which they certainly are not.

By Bruce Mitchell. June 30th, 2011 at 6:26 am

Stephen,

I first saw this graphic on “the Joy of Stats” fronted by Hans Rosling. The graphic was shown briefly, and the camera was ducking and diving all over it. You only saw the whole thing for a moment, which made it impossible to assess. The camerawork made me suspect that the graph wouldn’t stand up to scrutiny: I was uneasy with it yet it seemed a reasonable idea, but only – as you say – as long as it used a rigorous and correctly applied treemap methodology of parts to some whole.

It clearly does not.

Alongside all the flaws you have explicitly pointed up is another you have alluded to which throws the graphic straight onto the pile of high-school wannabe rejects.

The whole point of the graphic is the comparison of relative area of rectangles.

So it is critical that these are calculated correctly. If they are not, the graphic is worthless.

Take as your base unit “Income from National Insurance (NI)”, which is stated to be £100bn. On my monitor and resolution this comes out as 65 mm x 26 mm (1,690 mm2).

To take just two examples, VAT (£80bn) is given 962 mm2, while it should be 1,352, so underrepresented by 29%. “Africa’s entire debt …” (£128bn) gets 1,300 mm2 when it should be 2,163 mm2, understated by 40%. Of course this variable actually does not belong in the graphic at all.

Some, perhaps even the majority of the remainder may be correct. But that is not the point. They ALL have to be correct to display the correct relationships.
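The proportionality check described above is easy to reproduce. The sketch below takes the on-screen measurements quoted in this comment at face value (they depend on one monitor and resolution) and defines the shortfall relative to the expected area. That definition and the function names are assumptions made for the example.

```python
def expected_area(value, base_value, base_area):
    """Area a rectangle should have if area is proportional to value."""
    return base_area * value / base_value


def shortfall(measured_area, expected):
    """Fraction by which the measured area falls short of the expected area."""
    return (expected - measured_area) / expected


# Base unit from the comment: Income from National Insurance, 100 (bn pounds),
# measured on screen as 65 mm x 26 mm.
BASE_VALUE = 100
BASE_AREA = 65 * 26  # 1,690 square millimetres

# name: (stated value in bn pounds, measured area in square millimetres)
measurements = {
    "VAT": (80, 962),
    "Africa's entire debt": (128, 1300),
}

for name, (value, measured) in measurements.items():
    exp = expected_area(value, BASE_VALUE, BASE_AREA)
    print(f"{name}: expected {exp:.0f}, measured {measured}, "
          f"shortfall {shortfall(measured, exp):.0%}")
```

Measured this way, the shortfall comes out at roughly 29% for VAT and 40% for Africa’s debt. Both figures are only as reliable as the hand measurements that feed them.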

All in all, it’s a truly awful graphic and deserves to be shown up.

Bruce Mitchell

By David Leppik. June 30th, 2011 at 8:14 am

I’m a long time Few fan, but also a long time McCandless fan, and I think I’ve finally figured out why that’s not an oxymoron.

The bar graph does a great job of allowing a motivated individual to compare things quickly and deeply. But it looks like every other bar graph in the world. There’s no context integrated into it. So it fails as an infographic.

McCandless’s area chart has a visceral quality that is totally lacking in the bar graph. Those skinny lines don’t carry much weight. They don’t feel big. I look at it and I think: that bailout is *big*. I imagine that chart painted on the wall of an art museum, and people walking away with an emotional sense of the heft of the bailout. If I wanted to give someone a sense of how big the bailout is, I’d use the area chart.

The bar chart provides no sense of proportion. That is, they have no gravitas. They’re the same type of bars used to describe the market share of fava bean suppliers in Iceland. In contrast, big, bold squares and the way they fill the page suggest that the quantities involved are huge. People have a hard time telling the difference between millions and billions. This type of chart does it.

Point being, only the area chart is a good infographic. It draws otherwise unmotivated people into the graph, and it provides context through its visual heft. The line chart, while superior in all the ways Few describes, fails as an infographic because it provides no such context.

By Jamie. June 30th, 2011 at 8:37 am

@ David

I am having a hard time seeing how you reached any of the conclusions that you state.

“That is, they have no gravitas. They’re the same type of bars used to describe the market share of fava bean suppliers in Iceland.”

And tree maps are used for the same type of very pedestrian purposes as well. I don’t see how a tree map provides more “gravitas” than a bar chart.

“People have a hard time telling the difference between millions and billions. This type of chart does it.”

How on earth does it do that?

“Point being, only the area chart is a good infographic. It draws otherwise unmotivated people into the graph, and it provides context through its visual heft. The line chart, while superior in all the ways Few describes, fails as an infographic because it provides no such context.”

Again…how?
What “context” does the poorly implemented tree map provide?
Visual Heft?
If that’s the case, the argument easily turns to who can make a larger more visually salient graphic?
Which is just silly.

While I agree that a large block of colors on a page will attract an uninterested person by simple weight of visual prominence, I certainly don’t agree that such uninterested people will suddenly become interested in reading all of the values and trying (in vain) to make sense of the sizes and colors and values.

A tree map that does not follow the good practices of a tree map, and is difficult to properly interpret is not what I would ever call a good info graphic.

By Kyle Hailey. June 30th, 2011 at 8:46 am

What is the goal of the original diagram? Is it to engage the viewer in exploring the information, or is it to compare minutiae? If the goal is to compare similar values, then it fails, but if the goal is to engage the most people in exploring the information, it succeeds wildly.

The original diagram is like a story that invites people in for a fun and educational ride, whereas the simple bar chart is not as visually compelling and reaches fewer people.

@Chip: kudos for trying to break down why the original might be more visually compelling. It’s much harder to analyze why something is visually compelling than to check whether its measurements are correct and whether it follows well-known data display conventions.

By Stephen Few. June 30th, 2011 at 9:26 am

What is clear from this discussion and many others that I’ve had in the past is that we don’t all share the same expectations from “data visualization.” Some believe that the goal is mostly to get people to look at the graphic and if people get any information from it at all, the goal has been met as long as they enjoyed the process. I and many others, including the most talented journalist infographic designers in the world, believe that the goal is to inform as well as possible. When did the primary goal of journalism morph into providing entertainment? What does this say about culture at large? How does this support or diminish our opportunities to use data more effectively for decision making in the “information age”? These questions concern me. I want something more from information visualization, especially its use in journalism, than fun and games with a smidgen of information if you’re lucky.

By DR. June 30th, 2011 at 10:12 am

Your last comment is far and away the most useful part of this conversation. Your bar chart informs, but is not interesting. McCandless’ chart enlightens, but is not strictly correct.

But again, your comment regarding “entertainment vs. information” is more interesting. Case in point, right now I have:
– 3 computer screens in front of me, one of which is constantly updating with live financial data (US Markets seem stronger than they should be – bulls are out in force today.)
– A television on mute tuned to Women’s World Cup soccer. (France is up 1-0 on Canada, but I’m looking forward to the US vs. Colombia on Saturday.)
– A radio behind me tuned to NPR (Greece’s austerity measures would piss me off too, but I just don’t see an alternative – and the recent vote gave a hopeful bump to the EURO.)

This is what you are competing with. Your bar chart, while a better way to encode this information, absolutely will not give me pause in this environment. If you want to tell me a story, you best find a way to convince me to give you some attention first. It’s not about entertainment, it’s about breaking through the noise we’re all working with today.

I’m not going to argue in favor of the way McCandless encoded the information – as you and others have pointed out, his chart just isn’t technically very good. However, something made me take the time to look at it. I looked at it the first time I saw it, and it was the reason I looked at your post here. And both times I walked away with a bit of new knowledge.

The New York Times of course does this better. They capture the imagination, give me pause, AND deeply inform. Clearly that is the target. But I think you miss the deeper issue if you propose that your bar chart is the answer.

By Stephen Few. June 30th, 2011 at 10:46 am

DR,

As you can see in this discussion, your opinion that the bar graph version of this data “is not interesting” is not universally shared. Also, you must define “enlightening” differently than I do. To say that the bar graph informs but McCandless’ chart enlightens is logically flawed. To enlighten, it must first inform.

The primary goal of data visualization should not be to cut through the noise and grab people’s attention from competing chatter. The solution to overwhelming chatter is not to reduce the quality of data visualization to entertainment but to reduce the noise that surrounds us. Visualizations strive to reduce noise in the information itself so that signals can be seen and understood. Attempts to design visualizations primarily to grab people’s attention from competing chatter will diminish their ability to cut through the noise in the information itself.

The New York Times produces infographics that work well because they design them primarily to inform (unless it’s an entertainment piece), not primarily to entertain. They have made a conscious decision to respect their readers’ intelligence. People who want only to be entertained by pretty pictures don’t read the New York Times. I think the approach of the New York Times better serves society and honors the importance of information.

By DR. June 30th, 2011 at 11:31 am

I appreciate the response.

However I think I need to clarify.

First: enlighten vs. inform. Indeed, to enlighten one must inform. McCandless’ chart does enough informing to enlighten me on a few points. I suspect that’s all he was looking to do. Could he do more? Should he do more? Absolutely. I agree with everything you say on that. But by stripping away anything attention-grabbing, your bar chart sadly does neither.

Second: The bar chart neither informs nor enlightens because I wouldn’t stop to look at it. I agree 100% that “the solution to overwhelming chatter is not to reduce the quality of data visualization..”. Again, I’m not defending the poor sensemaking quality of McCandless’ graphic. However you say “the primary goal of data visualization should not be to cut through the noise…” and this just strikes me as naive. Perhaps this is the case in much of the work you do with dashboard design wherein the user is expected to provide you with undivided attention. But as you yourself point out, McCandless is operating in journalism space – where he better cut through the noise lest he be tossed into space with all the other ineffective information sources. In this sense, his graphic succeeds here where a bar chart would not. That alone is my core point.

Finally regarding the NYT: I didn’t say they produce infographics primarily to entertain. Indeed, they produce to inform – but they seek, and find, the balance. They recognize that to gain readership they need to capture attention. They realize they are competing with the likes of this:
(picture taken from my chair seconds ago).
And their work clearly and regularly demonstrates this simple fact.

By DR. June 30th, 2011 at 11:33 am

(oop, image stripped out.
http://i792.photobucket.com/albums/yy201/DR_phot/photo-1.jpg
Was just a picture of the kind of information CNBC throws at me every second of every day. There is not a financial house in the country that doesn’t have a 60″ LCD screen tuned to this as we speak.)

By Stephen Few. June 30th, 2011 at 12:37 pm

Your response confirms my concern. We appear to be living in a time when you and many others will not attend to an information display unless it exhibits eye-catching visual effects, even if those effects diminish clarity, accuracy, and breadth of understanding. You claim that you will not even look at a chart that displays information simply (which in part means no unnecessary visual effects), clearly, accurately, and in a way that leads to greater comprehension.

My insistence that the primary role of data visualization is not to cut through the chatter and grab your attention but to inform is not a sign of naivete but of principle. I believe in the potential of data visualization to enlighten far beyond the low-density and often inaccurate messages that McCandless’ charts achieve. Designing data visualizations to feed your demand that something scream loud enough to be heard above the chatter is a practice that undermines the power and goal of data visualization. As a practitioner who has dedicated his professional life to data communication and sense-making, I have no choice but to oppose this current trend.

If we redefine the primary goal of data visualization as attention-getting to cut through overwhelming chatter, the chatter will only increase. As it does, what will we require next of data visualizations to cut through the even greater chatter? Perhaps McCandless’ heir will develop a technique for shooting bolts of electricity from charts directly into the hearts of readers to stun them into attention. People need to be informed. The answer to distracting chatter isn’t to scream louder and louder. The solution to a world filled with ever-increasing noise is to reduce the noise.

By DR. June 30th, 2011 at 1:42 pm

At no point did I say I will not look at something that displays simply. I said I will not look at something that is not engaging. Very, very different.

For example, the following caught my eye. Simple AND engaging. (And certainly not “redefining the primary goal of data visualization..”)
http://www.nytimes.com/imagepages/2010/05/02/business/02metrics.html
It’s just a line chart, but it captures the mind because of that back-track at the end. They’re not yelling. They’re not shooting lightning bolts. They’re just making a little effort to find a way to be engaging.

Though I call everything around me noise (which sounds negative) I don’t consider it negative – in fact, it’s 2011; it’s required. Competition is fierce. I know nobody who does one thing well and is successful (in private industry). To be successful, one must do many things well. Outside of academia, clutter is the reality.

But one does not need to be louder to get through, one merely needs to be better. And frankly if you want me to be interested in something that I wouldn’t otherwise bother with (which to some extent is one goal of a good journalist – I really don’t care about the UK’s pension spending), the onus is on *you* to make me care.

Again, I have no disagreement with your critique of McCandless. I think you’re spot-on. Rather, put yourself in his shoes – as a journalist – and ask yourself if the bar chart you propose is sufficient. You may not like the distractions of modern life, but a journalist’s work has to stand up among them. Does this example do that well? Does it make me care? I believe it’s unfair to rewrite this graphic without that context.

By Andrew. June 30th, 2011 at 2:29 pm

I’m noticing a trend among supporters of McCandless-esque infographics. Many apparently think that data visualizations need to “grab the attention” of people who otherwise don’t care, and this property seems to be the only basis for preferring McCandless’ impoverished treemap over a simple bar chart.

I can’t seem to figure out why engaging apathetic passers-by is a higher priority than informing people who already do care, are already interested, and will already be engaged when the data are presented such that the viewer can make sense of them. When you put McCandless’ “art” in front of them, it’s like a slap in the face. So much information is hidden from the viewer because the data are presented with inappropriate display media; all the same numbers are there, but data visualizations can (should) encode a lot more information than numbers (easy example: rank, which the viewer must search for and discover on their own in McCandless’ version). When such useful information is hidden, the viewers are left with only clues, and must recreate it themselves.

If the priority is grabbing attention, there are much better ways than obfuscating the data with “art”, as McCandless’ work seems to do.

By Stephen Few. June 30th, 2011 at 2:34 pm

Until now, all of your examples of what makes something “engaging” in a way that cuts through the noise and grabs your attention have involved visual elements that undermine a chart’s effectiveness. In other words, they have involved examples of shouting to be heard.

McCandless’ chart is not simple. He sacrificed simplicity for eye-catchiness. A simple chart is one that eliminates all that isn’t essential to the story and displays the information in a way that is as easy to decode and understand as possible. You have not given examples of charts that grabbed your attention because they were “better.” How is McCandless’ chart better? It is not better, it is more eye-catching, because it involves large globs of color. That’s it.

It is entirely fair to judge the merits of McCandless’ work by the standards of fine journalism. I am not alone in critiquing McCandless’ work and finding it wanting. The best practitioners of journalistic infographics tend to share my opinion. They find ways to tell stories in meaningful, effective, and engaging ways without compromising the integrity of the work. For instance, an engaging photo, diagram, or even cartoon could be used to draw someone into the story. Charts don’t need to be spruced up in entertaining but compromising ways to achieve this effect.

The bar graph that I proposed as a better version of The Billion Pound-O-Gram was not meant to stand alone as a journalistic piece. If I were to tell this story, I would interweave words, charts, and perhaps other images to tell it in a way that was both engaging and fully informative.

By Stephen Few. June 30th, 2011 at 3:18 pm

Assuming that we are perhaps nearing the end of this particular discussion, let me quote the words of a man who shares my concerns:

If an editor should print bad English he would lose his position. Many editors are using and printing bad methods of graphic presentation, but they hold their jobs just the same. The trouble at present is that there are no standards by which graphic presentations can be prepared in accordance with definite rules so that their interpretation by the reader may be both rapid and accurate. It is certain that there will evolve for methods of graphic presentation a few useful and definite rules which will correspond with the rules of grammar for the spoken and written language. The rules of grammar for the English language are numerous as well as complex, and there are about as many exceptions as there are rules. Yet we all try to follow the rules in spite of their intricacies. The principles for a grammar of graphic presentation are so simple that a remarkably small number of rules would be sufficient to give a universal language.

These words are from the book Graphic Methods For Presenting Facts by Willard C. Brinton. He wrote them in the year 1914. At the time, it was true that “useful and definite rules” for graphical presentation had not yet evolved. Today, this is no longer true. They will certainly evolve further, but not very quickly if we continue to repeat the mistakes of the past.

It is this sense of history that adds fuel to my passion. We can do so much better. We should do better, but we backslide every time we become enamored of some new technology or some new group that involves itself with data visualization without learning the lessons of the past. How many steps backward must we take before we regain forward momentum?

By DR. June 30th, 2011 at 3:42 pm

There is a point I’ve attempted to make a few times that I seem unable to articulate. I’d like to try one more time, then leave you to your afternoon.

Never, not once, have I defended McCandless’ chart from a sense-making standpoint. It is not a good infographic. I never said, or meant to imply, that it was. My example of a better graphic was a NY Times piece (in black and white no less!). I think we can agree McCandless needs to be a better steward of his data. That ship has sailed. However, he excels in the story-making realm. This explains his popularity even in the face of sometimes meaningless graphics.

Perhaps we can boil this down to the following question:

**Infographics must inform. Do they need to do anything more?**

I’d be curious to hear your (and Andrew’s) response.

In the realm of journalism (the realm we’re discussing here), my answer is yes. Absolutely.

McCandless informs – albeit lightly, and poorly – but does do “more” either aesthetically or by telling a story.

Your bar chart informs but goes no further.

In my mind, both pieces fail.

If I were an editor I would struggle to choose between those two options. My thinking would be:
- “The McCandless option looks good, and my readers will stop and look at this. But ugh, it doesn’t actually make sense.”
- “The Few option has the right information, but ugh, the ad printed to the left will garner more attention.”

You mention that your piece was not meant to stand alone – however the blog post is titled “…redesigned”. So forgive my confusion. But a redesign should shoot for the same goals as the original – no?

By Stephen Few. June 30th, 2011 at 4:20 pm

Journalistic infographics should support the telling of a story. They need not and usually cannot tell the entire story by themselves. In the case of journalism, telling a story is what must be done to inform. My problem with McCandless’ infographics is that they don’t support the telling of a story well.

The title of this blog piece is “The Billion Pound-O-Gram Redesigned.” I understood “The Billion Pound-O-Gram” to be McCandless’ description of the chart. My intention was to reproduce his chart, not the entire story. I’ve described in broad strokes what I would do if I retold the entire story, but won’t bother actually doing this, because time is short and I don’t find the story itself interesting enough to bother.

By Andrew. June 30th, 2011 at 4:25 pm

> Infographics must inform. Do they need to do anything more?

I agree that they can and should do more than inform, but I don’t think it is necessary to sacrifice informational aspects (as McCandless does) in order to do more.

But personally, I have to wonder what “more” that McCandless’ charts can do. For starters, I’m curious as to why people say he is an effective story-teller. How so? The same stories seen in his graphic are easily conveyed in a mere glance of Stephen’s bar chart. If McCandless’ story-telling abilities are at all notable, shouldn’t they exceed the “traditional” methods?

As for grabbing attention or engaging the observer, clearly his charts do this (they even grab Stephen’s attention, if for the wrong reasons). But I don’t share others’ belief that this is such a high priority. Those who care about the information will give their attention and be genuinely-interested, and will want to learn more (and will therefore need displays that can inform them). People who care more about the ad printed to the left probably don’t care about the data or being informed anyway, so grabbing their attention would be worthless. That is, unless good journalism is all about getting the most viewers/readers. ;)

By DavidH. June 30th, 2011 at 9:41 pm

I must admit that the design of the McCandless graphic initially grabbed my attention. I studied the chart for a while before I realized that it wasn’t showing parts of a whole and that items like “Africa’s entire debt to Western nations” were completely arbitrary. McCandless could have just as easily used the value of all the oil in Saudi Arabia.

That realization led to an overwhelming feeling of being duped, and had I stumbled upon this graphic somewhere besides this forum I would have been pissed off at McCandless for wasting my time.

Some of the supporters of McCandless seem to feel that the goal of infographics is to provoke an emotional response in the viewer, not to inform or encourage intellectual exploration and discourse. It appears to me that they are confusing art and journalism.

By Jamie. July 1st, 2011 at 6:14 am

There is one other point I think is relevant to the comparison between these two visualizations.

I think we can (mostly) agree that an uninterested reader is not going to spend much time looking at either chart.
I think that methods of engaging a reader and developing that interest is a worthy goal.
I don’t think that either chart is going to do that, but more importantly, a chart shouldn’t stand alone to do this…the bad tree map is no more an ‘infographic’ than the bar chart.

The point to be made, however, is that the casual reader spending only a few seconds viewing each of these visualizations will walk away more informed having viewed the bar chart than from the other.

The interested viewer who already wants to know will clearly walk away more informed.

So if both the uninterested viewer and the interested viewer will be more informed by the bar chart…what argument is there?

By Pete. July 1st, 2011 at 6:20 am

> If I were an editor I would struggle to choose between those two options. My thinking would be:
> - “The McCandless option looks good, and my readers will stop and look at this. But ugh, it doesn’t actually make sense.”
> - “The Few option has the right information, but ugh, the ad printed to the left will garner more attention.”

DR, are you an editor for UK’s “The Sun”? Is there a need for a plunging neckline in an article about how a billion pounds are spent?

As for concerns about an ad distracting attention away from Few’s version, I like one reader’s comment posted in reference to the online version of the Guardian’s article:
“I can’t read the f[****] graphic because of a Lowe’s ad. Wonderful”

By Stephen Few. July 1st, 2011 at 10:04 am

I think I’ve discovered why McCandless’ pseudo-treemaps don’t always get the proportions right. In every example of a treemap that I’ve found on his site, one of the following conditions is exhibited:

1. The lower right corner includes empty space
2. The lower right rectangle is out of proportion
3. Spaces of irregular sizes were inserted between the rectangles throughout the treemap

I think all three conditions indicate that his space-filling treemap algorithm simply doesn’t work. It always produces either empty space in the lower right corner or a final rectangle in that location that is too large, which he then reduces in size to get it to fit into the larger rectangle. When he placed irregular spaces throughout the treemap, as he did with his natural gas example, he made the squares fit into the larger square by moving them around, resulting in the irregular areas of white space in between them.

I suspect that in the early days of his budding interest in data visualization McCandless saw a treemap and thought it was cool, and then wrote his own space-filling algorithm (he is a programmer, after all), rather than taking time to learn about treemaps and adopt one of the many available space-filling algorithms that researchers have developed. This would also explain why he uses a treemap for values that aren’t parts of a whole. He simply doesn’t understand them. Because he’s known as a data visualization expert, people assume that what might otherwise be seen as mistakes are actually intentional acts of artistic license.

I can’t know for sure that this is true, but evidence suggests that it is. If anyone finds evidence to the contrary, please share it.
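To illustrate what a working space-filling layout guarantees, here is a minimal sketch of the simplest approach, a single-level "slice" layout, in Python. It is not McCandless' method, nor any particular published algorithm in full; the function name and the sample values are hypothetical. It only shows that when each rectangle takes a share of the container equal to its share of the total, the rectangles fill the container exactly, with no gap or oversized piece left in a corner.

```python
def slice_layout(values, x, y, width, height):
    """Split one rectangle into strips whose areas are proportional to values.

    Returns a list of (x, y, width, height) tuples, one per value.
    Strips are cut along the longer side of the container.
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("values must sum to a positive number")

    rects = []
    offset = 0.0
    cut_along_x = width >= height  # slice along the longer dimension
    for v in values:
        share = v / total
        if cut_along_x:
            rects.append((x + offset, y, width * share, height))
            offset += width * share
        else:
            rects.append((x, y + offset, width, height * share))
            offset += height * share
    return rects


# Hypothetical amounts (not taken from the chart), laid out in a 100 x 50 box.
values = [175.0, 150.0, 100.0, 75.0]
rects = slice_layout(values, 0.0, 0.0, 100.0, 50.0)
areas = [w * h for (_, _, w, h) in rects]

for value, area in zip(values, areas):
    print(f"value={value:6.1f}  area={area:7.1f}  area/value={area / value:.2f}")
```

Every rectangle has the same area-to-value ratio, and the areas sum to the area of the container. Published treemap algorithms improve the aspect ratios of the rectangles, but they preserve this same property.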

By Anders. July 4th, 2011 at 1:01 am

Good call on the areas, but I do believe it’s a simpler explanation: It just looks good (to McCandless). As a programmer, he should have the mathematical skills to calculate the areas correctly; it’s not that difficult. Since he fails to plot correctly even in one dimension, I suspect it’s because it looks good. An example: See http://www.informationisbeautiful.net/play/snake-oil-supplements/ where the Strength of evidence seems to be a continuous variable; by eyeballing it, I can count at least 30 different values on the y-axis. However, checking the data, there are only 10 discrete levels (0-6, some in half-steps). There is no way that erroneous plotting is due to a bad algorithm.
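On the point that the areas are not difficult to calculate: if a value is to be encoded as the area of a square, the side length must scale with the square root of the value, not with the value itself. A minimal Python sketch follows; the function name and the amounts are hypothetical and are not taken from any of McCandless' charts.

```python
import math


def square_side(value, area_per_unit=1.0):
    """Side length of a square whose area encodes `value`.

    `area_per_unit` is a hypothetical scale factor: drawing area per unit of value.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    return math.sqrt(value * area_per_unit)


# Hypothetical amounts: one is four times the other.
large, small = 400.0, 100.0
side_ratio = square_side(large) / square_side(small)
area_ratio = square_side(large) ** 2 / square_side(small) ** 2
print(side_ratio, area_ratio)
```

A value four times larger gets a square with sides only twice as long. Scaling the sides in direct proportion to the values would exaggerate the difference in area.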

Many things that pass for infographics these days use ineffective graphing techniques, are naive, and dumb down complex issues in an attempt to engage the ignorant. Much so with McCandless: for example, the Billion Pound-O-Gram is not an effective graph, and it lacks the most important variable needed to make sense of the budget deficit, the total size of the budget, as pointed out on this blog. The world is multivariate, so comparing data and putting them in a meaningful perspective is the most basic task in sense-making. McCandless adds a “pseudo-multivariate” feel to it to counter his dumbed-down selection of data by using different shapes and colours (the shades of green in the Billion Pound-O-Gram do not encode any information).

However, the typical McCandless infographics have some other features that are somewhat more unique to him: the total disrespect for, or lack of understanding of, the data. The “Planes or Volcano” (http://www.informationisbeautiful.net/2010/planes-or-volcano/) graphic is not calculated correctly (it’s a snapshot of one day, almost a worst-case scenario for the cancelled planes, and not the complete emission from the volcano), nor does it include an estimate of how much extra CO2 was emitted due to increased use of alternative transportation or how many of the trips were simply postponed. The worst example I’ve found so far, because it’s closer to me, is the Snake Oil graphic. It’s supposed to be a summary of the scientific evidence for health supplements, and he even puts in a “worth it” line. But the general consensus among scientists is that healthy people with a recommended/normal diet do not need any supplements. So where does he get that line? A quick look at the data reveals that McCandless does not know the difference between lab bench experiments, supplements given to counter an illness/deficiency, and healthy people taking supplements in combination with a normal diet.

In short: Even if the reader can see past the ineffective graph, naive comparisons, and wrong areas/placement along the axis and manage to extract some information to take away, chances are that this information in many cases is wrong. That’s why a redesign as shown in the blog post above does not make a good infographic; it does highlight the ineffective charting technique, but doesn’t deal with the more serious, underlying flaws.

31 Comments on “The Billion Pound-O-Gram Redesigned”
