Review of the Research Study “What Makes a Visualization Memorable?”
Michelle Borkin et al. (Harvard School of Engineering and Applied Sciences and MIT)
No topic within the field of data visualization has created more heated debate over the years than that of “chart junk.” This is perhaps because, when Edward Tufte first introduced the concept, he did so provocatively, inviting a heated response. Ever since, this debate has not only flourished without signs of cessation, but it has generated some of the least substantive and defensible claims in the field. I’ve contributed to this debate many times, always trying to rein it back into the realm of science. Whenever a research study that appears to defend the usefulness of chart junk is published, the Web immediately comes alive with silly chatter, consisting mostly of chest thumping: “Ha, ha! Take that!” The latest study of this ilk was presented this week at the annual IEEE VisWeek Conference by Michelle Borkin et al. (students and faculty at Harvard and MIT), titled “What Makes a Visualization Memorable?” Yeah, you guessed it, apparently it’s chart junk.
When I last attended VisWeek in 2011, my favorite research study was presented by this same researcher, Michelle Borkin. Her study produced a brilliant, life-saving visualization of the coronary arteries that could be used by medical doctors to diagnose plaque build-up that indicates heart disease. It was elegant in its simplicity and clarity. Borkin’s latest study, however, does not resemble her previous work in the least. Here’s the paper’s abstract in full:
An ongoing debate in the Visualization community concerns the role that visualization types play in data understanding. In human cognition, understanding and memorability are intertwined. As a first step towards being able to ask questions about impact and effectiveness, here we ask: “What makes a visualization memorable?” We ran the largest scale visualization study to date using 2,070 single-panel visualizations, categorized with visualization type (e.g., bar chart, line graph, etc.), collected from news media sites, government reports, scientific journals, and infographic sources. Each visualization was annotated with additional attributes, including ratings for data-ink ratios and visual densities. Using Amazon’s Mechanical Turk, we collected memorability scores for hundreds of these visualizations, and discovered that observers are consistent in which visualizations they find memorable and forgettable. We find intuitive results (e.g., attributes like color and the inclusion of a human recognizable object enhance memorability) and less intuitive results (e.g., common graphs are less memorable than unique visualization types). Altogether our findings suggest that quantifying memorability is a general metric of the utility of information, an essential step towards determining how to design effective visualizations.
The authors collected a large set of data visualizations from the Web. Each visualization was coded by the research team for various characteristics (type of visualization, number of colors, data-ink ratio, the presence of pictograms, etc.). During a test session, subjects were shown one data visualization at a time for one second each, followed by a 1.4-second period of blank screen before the next visualization appeared. Each session displayed approximately 120 visualizations. The test was set up as a game with the objective of clicking whenever a visualization that had appeared previously appeared a second time. A particular visualization never appeared more than twice. Hits (the subject correctly indicated that the visualization had appeared previously) and false hits (the subject incorrectly indicated that a visualization had appeared previously when it hadn’t) were both scored, but misses were not. The study’s objective was to determine which of the coded characteristics caused visualizations to be most memorable.
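To make the scoring scheme concrete, here is a minimal sketch of how hits and false hits could be tallied for one session. This is purely illustrative: the function name and data shapes are my own assumptions, not the authors’ actual analysis code, which the paper does not publish.

```python
# Hypothetical sketch of the study's scoring scheme. Subjects click when they
# believe an image is repeating: a click on a true repeat counts as a hit; a
# click on a first showing counts as a false hit; unrecognized repeats
# (misses) are not scored, matching the paper's description.

def score_session(sequence, clicks):
    """sequence: image ids in presentation order.
    clicks: set of indices at which the subject clicked.
    Returns (hits, false_hits)."""
    seen = set()
    hits = false_hits = 0
    for i, image in enumerate(sequence):
        if i in clicks:
            if image in seen:   # image really did appear before
                hits += 1
            else:               # clicked on a first showing
                false_hits += 1
        seen.add(image)
    return hits, false_hits

# Example: "b" repeats at index 3; the subject clicks there (a hit) and
# also clicks at index 2, a first showing (a false hit).
print(score_session(["a", "b", "c", "b"], {2, 3}))  # (1, 1)
```

Note that because misses go unscored, this design measures only recognition of what stood out, not comprehension of what was shown, which is central to the critique that follows.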
Any form of presentation, be it a book, speech, lecture, infographic, news story, or research paper, to name but a few, should be judged on how well it achieves the author’s objectives and the degree to which those objectives are worthwhile. A research paper in particular should be judged by how well it does what the authors claim and how useful its findings are to the field of study. This study does not actually do what it claims. What it actually demonstrates is quite different from the authors’ claims and does not qualify as new information.
The title of this study, “What Makes a Visualization Memorable?,” is misleading. It doesn’t demonstrate what makes a visualization memorable. A more accurate title might be: “When visualizations are presented for one second each in a long series, what visual elements or attributes most enable people to remember that they’ve seen one when it appears a second time?” That’s a mouthful and not a particularly great title, but it accurately describes what the study was actually designed to test. The study did not determine what makes a visualization memorable, but what visual elements or attributes included in the visualization would be noticed when viewed for only a second and then recognized when seen again. A data visualization contains content. Its purpose is to communicate that content. A visualization is not memorable unless its content is memorable. Merely knowing that you saw something a minute or two ago does not contribute in any obvious way to the goals of data visualization. And, more fundamentally, remembering something about the design of a visualization is nothing but a distraction. Ultimately, only the content matters; the design should disappear.
When an image appears before your eyes for only a second and then disappears, what actually goes on in your brain perceptually and cognitively? When the image is a visualization, you don’t have time to even begin making sense of it. At best, what happens in that brief moment is that something catches your eye that can be stored as a distinct memory. When the task being tested is your ability to recall whether you’ve seen the image before when it’s flashed in front of your eyes a second time, then the memory must differentiate the image from the others that are being presented. If a clean and simple bar graph appears, there is nothing unique, no differentiator, from which to form a distinct memory. At best, in the single second that you view it, the concept “bar graph” forms in your brain, but you’re seeing many bar graphs and nothing about them is being recorded to differentiate them. If you see something with a profusion of colors, that colorful image is imprinted, which can serve as a distinct memory for near-term recall. If you see a novel form of display, a representation of that novelty can be retained. If you see a diagram that forms a distinct shape, it can be temporarily retained. What I’m describing is sometimes called stickiness. Something sticks because something about it stood out as memorable. That something rarely has anything to do with the content of the visualization.
Visualizations cannot be read and understood in a second. Flashing a graph in front of someone’s eyes for a second tells us nothing useful about the graphical communication, with one possible exception: the ability to grab attention. Knowing this can be useful when you are displaying information in a context that requires that you first catch viewers’ eyes to get them to look, such as in a newspaper or on a public-facing website. This potential use of immediate stickiness, however, was not mentioned in the study.
So, when the authors of this study made the following claim, they were mistaken:
Altogether our findings suggest that quantifying memorability is a general metric of the utility of information, an essential step towards determining how to design effective visualizations.
Whether the assertion is true or not, this study did not test it. They went on to say:
Clearly, a more memorable visualization is not necessarily a more comprehensible one. However, knowing what makes a visualization memorable is a step towards answering higher level questions like “What makes a visualization engaging?” or “What makes a visualization effective?”.
Although the first sentence is true, what follows is pure conjecture. The authors seemed to wake up toward the end of the paper when they stated:
We do not want just any part of the visualization to stick (e.g., chart junk), but rather we want the most important relevant aspects of the data or trend the author is trying to convey to stick.
Yes, this statement is absolutely true. Unfortunately, this study does not address this aspect of stickiness at all. Sanity prevailed when they further stated:
We also hope to show in future work that memorability — i.e., treating visualizations as scenes — does not necessarily translate to an understanding of the visualizations themselves. Nor does excessive visual clutter aid comprehension of the actual information in the visualization (and may instead interfere with it).
If they do go on to show this in the future, they will have succeeded in exposing the uselessness of this paper. If only this realization had encouraged them to forego the publication of this study and quickly move on to the next.
If we reframed this study as potentially useful for immediately catching the reader’s eye and that alone, the following findings might have some use:
Not surprisingly, attributes such as color and the inclusion of a human recognizable object enhance memorability. And similar to previous studies we found that visualizations with low data-to-ink ratios and high visual densities (i.e., more chart junk and “clutter”) were more memorable than minimal, “clean” visualizations.
More surprisingly, we found that unique visualization types (pictoral [sic], grid/matrix, trees and networks, and diagrams) had significantly higher memorability scores than common graphs (circles, area, points, bars, and lines). It appears that novel and unexpected visualizations can be better remembered than the visualizations with limited variability that we are exposed to since elementary school.
As I mentioned in the beginning, however, these are not new findings. It’s interesting that the finding described in the second paragraph above contradicted the authors’ expectations. They assumed that familiar visualizations, such as bar and line graphs, would be more memorable than novel visualizations. We’ve known for some time that novelty is sticky. The wonderful book by brothers Chip and Dan Heath, Made to Stick, made a big deal of this.
The one part of this study that I found most interesting and informative was a section that wasn’t actually relevant to the study. The authors quantified the number of times particular types of visualization appeared in four venues: scientific publications, infographics, all news media, and government and world organizations. I found it interesting to note that news media of all types use bar and line graphs extensively, but infographics seldom include them. It was also interesting that tables supposedly appear much more often in infographics than in scientific publications, which doesn’t ring true to my experience.
A few other problems with the study are worth mentioning:
- The authors created a new taxonomy for categorizing visualizations that wasn’t actually useful for the task at hand. When visualizations are revealed for only a second, we can conclude nothing reliable about the comparative memorability of the visualization types defined by their taxonomy. Because their taxonomy did not define visualization types as homogeneous groups, comparisons made between them are meaningless. For example, grouping all graphs together that show distributions (histograms, box plots, frequency polygons, strip plots, tallies, stem-and-leaf plots, etc.) is not useful for determining the relative memorability of visualization types.
- They described bars (rectangles) and lines (contours) as “not natural,” but diagrams, radial plots, and heat maps as “more natural” and thus more memorable. From the perspective of visual perception, however, few shapes are more natural than rectangles and contours, which represent much of our world.
- I found it interesting that the racial mix of participants in the experiment (41.7% Caucasian, 37.5% South Asian, 4.2% African, 4.2% East Asian, 1.1% Hispanic, and 11.3% other/unreported) was considered by the authors to be “sampled fairly from the Mechanical Turk worker population.” When did Mechanical Turk become the population that matters? Wouldn’t it be more useful to have a fair sample of the general population? A 37.5% proportion of South Asians is not at all representative of the population in the United States in particular or the world in general, nor are 4.2% African and 1.1% Hispanic representative.
I’ve yet to see a useful study about chart junk in the last decade or so. Perhaps there’s something about the controversial nature of the debate and the provocative nature of claims that chart junk is useful (e.g., the possibility of knocking Tufte and Few down a notch or two) that shifts researchers from System 2 thinking (slow and rational) into System 1 (fast and emotional). Despite the flaws in this study, just like the others that have preceded it, dozens of future studies will cite it as credible and people will make outlandish claims based on it, which has already begun in the media.