A Pie in the Face for Information Visualization Research
Even though several research studies over the years have sought to compare the relative effectiveness of pie charts vs. bar graphs, only for one task have bar graphs failed to outperform pie charts. The one potential advantage of pie charts was identified in a study by Spence and Lewandowsky titled “Displaying Proportions and Percentages” (Applied Cognitive Psychology, Vol. 5, 1991). This study has probably been cited more often than any other to support the pie chart’s worth. I suspect that most of these citations, however, were made by researchers who never actually read the original paper, so they tend to give pie charts more than their due. In all fields of research, not just information visualization, studies are routinely cited that weren’t actually read, resulting in misrepresentations of the original work’s findings. According to a study by Mikhail Simkin and Vwani Roychowdhury, only about 20 percent of scientists who cite an article have actually read the paper (“Read Before You Cite!”, Complex Systems, 14, 2003). In most cases, researchers have only read comments in secondary sources about the studies that they cite, and those sources were often written by others who also relied on secondary sources. This is one of the ways that errors proliferate and sometimes become common knowledge, even in scientific circles.
Few researchers bother to mention that the study by Spence and Lewandowsky robbed bar graphs of their quantitative scales. Perhaps, because pie charts lack quantitative scales, Spence and Lewandowsky felt that scales should be removed from the bar graphs to even the playing field. In fact, a pie chart has an implied scale that goes from 0% to 100% in a circle around the perimeter of the pie, but it is never shown because it isn’t helpful. By removing the scales from bar graphs, however, their study failed to measure the effectiveness of bar graphs as actually used.
Nevertheless, even when hamstrung in this way, bar graphs performed better than pie charts for every task except comparisons of summed parts. Imagine a pie chart with four slices, labeled A through D, and a bar graph with four bars, one for each of the same values. Now imagine the following task: either compare the sum of slices A and B to the sum of slices C and D to determine which is greater or perform the same comparison using the corresponding bars in the bar graph. The study found that test subjects could estimate the sums of two slices and compare them to the sums of another two slices more effectively than they could estimate and compare the combined lengths of bars. This isn’t surprising, but even this one advantage of pie charts might not have been found had the bar graphs possessed their scales.
Comparing the lengths of two bars that share a common baseline is handled by the visual cortex of the brain in a preattentive manner that is fast and as precise a comparison as visual perception supports. Comparing the sizes or angles of pie slices is also handled by the visual cortex, but not as precisely and usually not as quickly either, because we typically strive for a level of precision that the pie chart doesn’t support, which slows us down. Decoding the value represented by a slice of pie requires us to estimate the percentage of the circle that belongs to the slice, which is difficult. Decoding the value represented by a bar involves a straightforward lookup: we compare the end of the bar to the nearest value along the scale. When a bar graph is properly designed, we can perform this task quickly, easily, and precisely.
The fundamental superiority of bar graphs over pie charts is rooted in a fact of visual perception: we can compare the 2-D positions of objects (such as the ends of bars) or their lengths (especially when they share a common baseline), more easily and precisely than we can compare the sizes or angles of pie slices. When people like Edward Tufte, William Cleveland, Naomi Robbins, and I express disdain for pie charts, it is for this reason and this reason alone. We love circles as much as anyone, but we don’t worship them and we don’t expect from them what they can’t provide.
Despite the perceptual problems associated with pie charts, which are well established, every once in a while some new study or book comes along and suggests that the experts have been wrong all along. Even when these studies are utterly absurd and completely unfounded, lovers of pie charts, especially software vendors that promote silly, ineffective data visualization practices, celebrate them: “Mission accomplished! We have proven the worth of our beloved pie.” To quote the conclusion of a recent journal article: “The pie is a communication chart par excellence…pies are from Venus, bars are from Mars” (Charles Wesley Ervin, “Pie charts in financial communications,” Information Design Journal, 19:3, 2011). People love circles, there’s no doubt about it, but they are rarely useful for displaying quantitative information.
A few days ago I discovered the latest paper that gives an undeserved thumbs up for the pie chart: “Using fNIRS Brain Sensing to Evaluate Information Visualization Interfaces,” written by students and faculty at Tufts University. I discovered this paper when reading a blog post that cited my negative opinion of pie charts and then pointed to this paper as potential evidence of my error. This paper has been accepted for presentation at CHI 2013 in Paris later this year and is already available in published form. This study is misdesigned, misinterpreted, and misrepresented. I wish I could say that this is an anomaly, but sadly, I cannot. If you are not intimately acquainted with academic research, you might assume that most of it is well done, and that getting published is a sure sign of credibility. This is far from true. Bad research gets published in every field, but in the field of information visualization, it sometimes even wins awards.
The following bold statement appears in the paper’s abstract (emphasis mine):
In this paper, we use the classic comparison of bar graphs and pie charts to test the viability of fNIRS [functional near-infrared spectroscopy] for measuring the impact of a visual design on the brain. Our results demonstrate that we can indeed measure this impact, and furthermore measurements indicate that there are not universal differences in bar graphs and pie charts.
In fact, this study demonstrates nothing of the kind. It does not meaningfully measure the impact of visual design on the brain, and it definitely does not indicate anything universal or even otherwise about differences between bar graphs and pie charts. The primary problem with this study is the fact that it did not simulate any of the actual tasks that people perform when using bar graphs and pie charts. This is only apparent, however, if you read beyond the abstract.
I’ll describe the tasks that test subjects performed. See if you can identify the problem. Subjects performed multiple series of tasks. Each time they were shown 11 slides in sequence, lasting 3.7 seconds per slide. Each slide in a particular series displayed either a single bar graph or pie chart. Each chart displayed multiple bars or slices. Among them, one bar or slice was marked with a large black dot and one with a small red dot. The subject was required to compare the length of the bar or size of the slice marked with the red dot to the length of the bar or slice of the pie marked with the black dot on the previous slide. Items marked with black dots always represented values that were greater than those marked with red dots. The subject’s task for each slide was to estimate how much larger the item marked with the black dot on the previous slide was compared to the item marked with the red dot on the current slide, to the nearest 10%. In other words, they would indicate that it was approximately 10% greater, 20% greater, 30% greater, etc., which they did by pressing an appropriate key on a keyboard. After making this choice, they then had to quickly look at the item marked with the black dot in the current slide before the 3.7 seconds were up so they could remember it when the next slide appeared and they were required to compare it to the item marked with the red dot there. The figure below shows an example of three slides in an eleven-slide series, in this case consisting of pie charts:
Think about this task. Is this what we do when we compare values in bar graphs or pie charts? It isn’t. What’s different from our actual use of these charts? The things that subjects compared were never simultaneously visible.
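A minimal sketch, not taken from the study, may make the structure of this task concrete. The `Slide` class, the `expected_answers` function, and the sample values are all hypothetical, and the sketch assumes that "X% greater" is measured relative to the item marked with the red dot. What it shows is that every answer depends on a value from the previous slide, which is no longer on screen.

```python
from dataclasses import dataclass


@dataclass
class Slide:
    black: float  # value of the item marked with the large black dot
    red: float    # value of the item marked with the small red dot


def expected_answers(slides):
    """For each slide after the first, return how much larger (in percent,
    rounded to the nearest 10) the PREVIOUS slide's black item is than the
    CURRENT slide's red item."""
    answers = []
    for previous, current in zip(slides, slides[1:]):
        # Assumption: "X% greater" is computed relative to the red item.
        percent_greater = (previous.black - current.red) / current.red * 100
        answers.append(int(round(percent_greater / 10)) * 10)
    return answers


# Hypothetical three-slide excerpt; the values are made up for illustration.
series = [Slide(black=60, red=30), Slide(black=45, red=40), Slide(black=50, red=25)]
print(expected_answers(series))  # [50, 80]
```

Under this reading, an eleven-slide series yields ten comparisons, and none of them can be made from a single slide.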
When we use a chart to compare either slices or bars, we almost always compare values within a single chart. The values are right there near one another, which allows the visual cortex of the brain to handle the comparison. On less frequent occasions when we compare values that reside in separate charts, we always put those charts in front of our eyes at the same time, such as in a trellis display. This is a fundamental practice of data visualization. Why? Because, if the things that we need to compare are not simultaneously visible, we must rely on working memory, which is extremely limited. Work is transferred from the visual cortex to working memory—from our strength to our weakness—which is just plain dumb.
The designers of this study created a task that was handled by working memory because they wanted to demonstrate the usefulness of fNIRS technology for data visualization research and fNIRS can only measure neural activity in the prefrontal cortex, not the visual cortex. They created an unrealistic, artificial task. In doing so, they created something to measure in the prefrontal cortex, but it had nothing to do with a realistic use of charts.
This study was not actually designed to compare the effectiveness of bar graphs vs. pie charts, yet it makes the claim that “there are not universal differences in bar graphs and pie charts.” Instead, this study was designed to demonstrate a use for fNIRS technology in the field of data visualization research. It failed to achieve the latter and should have made no claims regarding the former.
Only one potentially meaningful finding should have been claimed by this study: a positive correlation between test subjects’ subjective sense of difficulty associated with the use of bar graphs vs. pie charts and hemoglobin oxygenation levels in the prefrontal cortex. Subjects who felt that bar graphs were more difficult exhibited higher levels of oxygenation when using bar graphs. Those who felt that pie charts were more difficult exhibited higher levels of oxygenation when using pie charts. This tells us nothing about the relative effectiveness of bar graphs vs. pie charts. Subjects’ preferences for one type of chart over the other might have been a predisposition, but predispositions were not tested. Whether or not a predisposition existed, we don’t know if test subjects’ sense of difficulty and higher levels of oxygenation have any relationship to the effectiveness of the charts. What the experiment found is that working memory performed equally well (or equally poorly) regardless of the chart that was used.
This and other studies done at Tufts University interpret higher levels of hemoglobin oxygenation in the prefrontal cortex as “cognitive load,” by which they imply “cognitive overload.” A negative connotation is assumed. Measuring hemoglobin oxygenation levels in the prefrontal cortex may be a valid measure of brain activity, but we have no reason to believe that this activity is necessarily negative. Perhaps high levels of activity correlate to greater insights rather than counterproductive overload. In truth, oxygenation levels probably indicate neural activity of many types: some positive and some negative. To date, we don’t know how to discriminate between them.
The use of neuroimaging such as fNIRS in HCI studies is still in its infancy. fNIRS may be useful, but we must be careful to read no more into these measures than our current understanding can actually support. Using fNIRS to interpret neural activity is a bit like using temperature readings inside a building to determine the specific activities that are going on within, even though we are separated from those activities by a solid, opaque wall.
The authors of this study indicated the need for caution, but notice how they failed to heed this concern (emphasis mine):
During the course of this paper, we have been intentionally ambiguous about assigning a specific cognitive state to our fNIRS readings. The brain is extremely complex and it is dangerous to make unsubstantiated claims about functionality. However, for fNIRS to be a useful tool in the evaluation of visual design, there also needs to be an understanding of what cognitive processes fNIRS signals may represent. In our experiment, we have reason to believe that the signals we recorded correlate with levels of mental demand.
Notice the reasoning here. We can’t assign specific cognitive states to fNIRS readings, but these readings are useless to us unless we can assign specific states to them, so we’re going to do so. After the disclaimer, they went on to declare:
Our findings suggest that fNIRS can be used to monitor differences in brain activity that derive exclusively from visual design. We find that levels of deoxygenated hemoglobin in the prefrontal cortex (PFC) differ during interaction with bar graphs and pie charts. However, there are not categorical differences between the two graphs. Instead, changes in deoxygenated hemoglobin correlated with the type of display that participants believed was more difficult.
“Differences in brain activity that derive exclusively from visual design”? What they actually found were differences related to subjective feelings of difficulty and oxygenation levels associated with those feelings, which they assumed were “derived exclusively from visual design.” It is entirely possible, however, that those subjective feelings were derived from dispositions regarding bar graphs vs. pie charts that did not grow out of differences in visual design.
Because fNIRS can only measure activity in the prefrontal cortex, not the visual cortex, the authors acknowledge that it is only potentially useful for measuring more complex tasks that involve the prefrontal cortex.
We find that fNIRS can provide insight on the impact of visual design during interaction with difficult, analytical tasks, but is less suited for simple, perceptual comparisons.
Even this statement contains an error. Remembering the size of a slice or bar so it can be compared to another slice or bar later is indeed a difficult task because of working memory’s limitations, but is it an analytical task? Does it require reasoning? It is entirely a task of memory. The prefrontal cortex handles many tasks, but we cannot currently use fNIRS to specifically measure analytical tasks because it cannot discriminate among different neural activities.
Research studies like this should prompt us to ask several questions, including:
- How can students earn PhDs while focusing on information visualization without first learning the fundamental skills required of the discipline (best practices of graph design, the basic tenets of the scientific method, an understanding of visual perception and cognition, and critical thinking)?
- Do the professors who participate in these studies and the reviewers who approve them also lack these skills?
- Do the professors who advise these students review these studies carefully?
- Why aren’t researchers in information visualization asked to go back and correct their work prior to approval for publication based on feedback from reviewers?
I am not writing about this particular study because it is extraordinarily bad, but merely because its claims address topics of interest to me. This paper is typically bad. The problems that we see in it arise from deeper problems that are both endemic and systemic. Papers get published and awards are given when studies exhibit novelty or make controversial claims. A study that tests a hypothesis that turns out to be false is rarely published, even though it is still informative. A study that tries to replicate a past study to confirm or deny its findings is considered boring and thus avoided. Students in doctoral programs are encouraged to find something sexy. Sometimes this takes the form of studies that supposedly challenge long-established best practices. When you’re a young up-and-comer, it’s exhilarating to take a leader in the field down a peg or two. What academics sometimes forget, however, is that their work affects the world. People trust their findings and make decisions based on them. When studies make erroneous claims, they do harm. Research should be better reviewed for the merits of its content. We need fact checkers; not after the fact, such as this review that I’m writing, but prior to publication. Students should receive corrective guidance during the course of their research rather than being subjected to corrective reviews like this post-publication. The bar must be raised, but that won’t happen until academics themselves become willing to speak up.
Take care,
17 Comments on “A Pie in the Face for Information Visualization Research”
You write: “When you’re a young up-and-comer, it’s exhilarating to take a leader in the field down a peg or two.”
After reading your post, it’s apparent that—when you’re a leader in the field—it’s exhilarating to take a young up-and-comer down a peg or two. Clear critique is constructive. Shaming is not.
Jonathan,
In your opinion, what in particular have I said that you feel is inappropriate? What have I said that didn’t need to be said?
Steve, you’ve only begun to say what needs to be said. I applaud your courage, and believe you would like nothing more than for everyone to join you at the summit of Mt. VBI. You are closer than any of us. Those coming up after you may succeed, or they may fail, but they are without question following your lead. Experienced and caring expedition leaders will not hesitate to tell any climber (young or old) they are making a stupid mistake, or putting others at risk with their selfish acts. It’s not ego, it’s simply the right thing to do. Keep it up!
Stephen,
Several comments/ideas come to mind:
First, I wonder what fNIRS readings would look like if one could compare the business decision errors that ensue from using pie charts vs. bar graphs in their ‘typical’ applications. A poor method for assessing what brain activity measurements look like for negative consequences.
Second, I read through the paper and many more questions are raised. Is the sample population representative? How do age, cultural background, education, etc. affect how the prefrontal cortex handles this kind of stimulus?
Third, in spite of the unsupported conclusions and inconsistencies, I can see that with proper experimental design and alternative experiments/data gathering, fNIRS could be a powerful tool to add to studies of how we process information.
Lastly it will be interesting to see how the authors of the paper respond to your observations if at all.
In response to this blog post I received an email from a fellow who works as a data analyst for SAS. His only words appeared in the subject line: “I think I can convince you that there are good uses for Pie Charts.” I responded, “Go for it.”
This was his reply:
Before I take the time to comment on this, take some time for yourself to look it over. Is this the best way to display this information? Does this stacked pie chart tell the story better than any other form of display?
I asked for the data that was used to build this chart, because I wanted to suggest an alternative. There was no way that I could have decoded this chart to get the values that I needed. Imagine the effort that this would have taken. Once I had the data, I wrote the following:
Here’s what I received in response:
“Not always the worst” is not a rousing endorsement of pie charts. I responded:
More discussion followed, but you get the gist of it. People defend ineffective forms of display such as pie charts for various reasons. It appears, based on further discussion with him, that this particular fellow is defending pie charts because he’s been using and promoting them for years.
People rely on me to teach them best practices: ways of visualizing data that help them think and communicate more effectively. Discussions about which visualization works best should not be a battle of opinions or egos; it should be about what will enable people to work most effectively with data to support better decisions. Whether you’re a data analyst, such as this fellow from SAS, a student trying to earn his doctorate in information visualization at a university such as Tufts, or anyone else who hopes to contribute to better uses of data, this should be our common goal. We should figure out what works, not based on opinion or preference, but based on good science. Anything less is irresponsible.
As an addendum to my comment above about the SAS employee who believes that his stacked pie chart is an effective solution, here’s the latest email that I received from him a moment ago.
Notice that he’s framed the argument in a way that dismisses my position as mere theory and elevates his as valid practice. This frame is a delusion. He isn’t able to see beyond his own biases. In the last email that I sent to him earlier today I suggested the following:
How do I know that this is what he would find if he answered these questions with clear eyes and an open mind? Because this is what we’ve found when we’ve put these graphics to the test. This isn’t mere theory or opinion, but solid evidence, confirmed through many years of practice and careful attention.
At no point during our correspondence did he respond to any of the evidence that I provided. Instead, he consistently relied on his observation that people instantly understand his stacked pie chart and are dumbfounded by bar graphs. Does this ring true to you?
I shouldn’t get so worked up about this. I just don’t want to believe that people can be this thick-headed. I guess I should open my own eyes and realize that some people are beyond reason and will always believe what they want to believe, even when the evidence says otherwise.
Stephen – I often compare the staunch pie chart defenders to the extremist creationists who fight in such irrational ways against evolution, or climate change, and other sound science.
The same mentality is apparent, the same refusal to accept evidence.
Stephen,
I do run into situations where people show a preference for pie charts, and when this happens I start asking questions similar to those you described to the SAS analyst. Usually the reaction is one of ‘wow, you are right, I can get more information’.
What is alarming to me is someone from an organization like SAS using preference as a criterion when the purpose of creating graphs, charts, etc. is cognition. This is similar to the logic error present in the Tufts University paper, which equates prefrontal cortex activity with cognition.
Two points regarding your correspondent’s pie chart:
1) Moire vibrations – yuk!
2) I have absolutely no idea whether I am supposed to be judging the relative numbers by area or by radial extent.
Stephen, like most of your opinions, these are 90% right. I have a small caveat.
You give very nice examples of how bar charts clearly communicate the value of individual measures relative to a set. The pie charts in the context of small multiples can create visual distortions. Great demonstration.
I have an emotional attachment to these suckers. It’s as simple as that. Defies cognitive science, intellect, proof.
I just want to see something that shows me that these individual measures add up to 100% of the set that is being discussed.
Members of a set look lonely and unsocial sitting next to each other in a bar graph. Like a bad party where the DJ is playing the music too loud.
Pies are bad for me, I know. I always regret them in the morning when my friends make fun of me. I’m cutting back but I can’t get myself to erase their number for good in my contacts list. They were my first data experience. And sometimes, late at night, it gets lonely. I… I just want my data to visually merge. No matter how late, they always pick up the phone.
Michael,
I appreciate the humor of your comments. Bars are not sexy. An emotional attachment to pie charts is common. For purposes of data sensemaking and communication, in an attempt to make better decisions, however, we must reach beyond emotional attachment to progress. To move forward, we must allow System 2 thinking (slow, conscious, analytical, deliberative) to evaluate and, if necessary, override the preferences of System 1 thinking (immediate, unconscious, intuitive).
Regarding my opinions (or what I believe can more accurately be described as my understanding, based on science), does the 10% that in your opinion is inaccurate all fall into this category of your emotional attachments? If even 1% of what I teach is not aligned with the best evidence available, I’d like to know so I can correct it. If you have any examples of things that I teach that are inaccurate based on scientific evidence or reason, please let me know.
Whenever I use bar graphs to display parts of a whole, I make it clear to the audience that the bars add up to the whole of something. This is necessary, because, unlike pie charts, bar graphs can be used for purposes other than showing parts of a whole. In this particular case, I didn’t bother, because in the context of this blog post, the fact that the parts (jobs in this case) add up to all jobs is already known. Even if someone stumbled onto my bar graph out of context, the labels “Unknown” and “Other” would make it clear that all jobs are accounted for.
No Stephen, I was just being glib. I’ll let you know if I have a scientific quibble. Thought a little humor might help when dealing with a System 1 world.
Mr. Few
I am involved with professional football and work with data. I see many Americans invest a lot in analytics and use data visualisation techniques that are not mature. In France we have a problem of not having our own beliefs and hire outside experts from England or Spain, but we know bad art when we see it!
Bonjour
Stephen,
When I was relatively new to data viz, I came to the independent conclusion that pie charts were not generally useful. I began to see that a pie chart was only clear with around 2-5 categories. Any more groups and the slices are indistinguishable from each other.
I think it’s going to be hard to stamp out the two-category pie chart because it seems to me that pie charts are more effective the fewer groups you use. (If that’s the case, then maybe a zero-category pie chart is the best to use.) I have created a side-by-side comparison. For fun I put 3-D bar charts in too. Maybe your correspondent is thinking of them? In the case I present, a 2-D two-category pie chart is better than a 3-D cone chart. :)
Steve,
I’m convinced of your argument. Two comments:
First, the only rationales I have for the enduring appeal are 1) they are familiar and thus comforting and 2) since they can only be used for one purpose, i.e., showing part-to-whole relationships, versus bars, which can be a variety of things, preattentive recognition allows users to instantly know what they are looking at and how to interpret it. These two points are related. Claims of pie popularity are likely based on such advantages beating out the measurable cognitive performance advantage of bars.
This maps to a foundational graphic design principle: the distinction between readability and legibility. Legibility is the measurable ability of humans to decipher text. Readability is the effect of the entire graphic presentation on the viewer’s inclination to read it in the first place. The latter involves seduction to entice the reader to invest. Perhaps something about pies related to familiarity and instant recognition of their purpose contributes to the viewer’s inclination to engage with the content because they know what is expected of them.
The second comment, also related to the first, points to a sobering conclusion for precision- and performance-oriented geeks like you and me: that in fact many consumers of quantitative data do so more for the feeling that they are being precise and informed than for the actual benefit that such learning delivers. Such attitudes might persist until the performance gain of bars versus pies results in advantage (e.g. money) won and lost at meaningful scales. I’ve never seen a pie chart on a Wall Street trading platform. Ever.
Cheers
John
Q – When will pie charts cease to be ubiquitous?
A – When M$ Excel removes pie charts from the available chart types.
I often struggle trying to convince people that pie charts aren’t particularly helpful, and that another, well designed graph or even a table is more useful. I’ve illustrated this many a time, and frequently people listen. But a surprising number of people just don’t really care.
I suspect that a lot of people don’t think sufficiently abstractly or critically to really intuit why these graphs are more useful. And the comfort of existing displays allows them to continue down the well-worn paths of their own mental model, with the pie chart as a shield against criticism (see, I do use evidence!) rather than a useful aid.
I often have this type of argument with those from accountancy backgrounds about when it’s better to use a graph than tables of numbers. Even when reviewing strategic goals they love to see the numbers so they can add them up to check totals and verify it all reconciles, but they only reluctantly use a line graph. It seems to just mean less to them. Though as an operational researcher, I tend to think of numbers in terms of ‘sentences’ rather than individual ‘words’. Or maybe more like verbs than nouns. Or something ;-)
Thank you for the post. I’m going to use it to poke fun at the next SAS consultant I meet.