A new book about information graphics was published last month: The Wall Street Journal Guide to Information Graphics, by Dona M. Wong, the graphics director of this respected newspaper. I get excited whenever a new book about data visualization is published, especially one that teaches practical techniques, because too few of us are working in this field. This new addition to my library has its merits, but unfortunately it has its problems as well.
To begin, this book is not what its advertising claims it to be. Rather than “the definitive guide to the graphic presentation of information” and “an invaluable reference work for students and professionals in all fields,” which the dust cover claims, it would be more accurately described as a graphical style guide for financial journalism. I suspect that the content of this book was in fact written by Wong originally as the graphics style guide that is used internally at The Wall Street Journal, and that the newspaper envisioned a new source of revenue by revising it slightly and publishing it as a book. There’s certainly nothing wrong with that, but they should have more clearly described its scope as restricted primarily to the interests of financial journalism.
The quality of this book that will no doubt appeal to many potential readers is, in my opinion, its fundamental failure: it includes relatively few words. Unlike her mentor, Edward Tufte, who uses words liberally and eloquently, Wong writes in a style closer to the bullet-point approach that Tufte disdains. In this respect, her book is different from my books, which have at times been criticized for having too many words. A few readers have remarked that I don’t follow my own principle of simplicity because I use too many words to present the material. What they don’t appreciate is the important difference between simplicity and over-simplification. I provide the context that people need to understand what I teach. When you tell people what they should and shouldn’t do without explaining why, they can at best learn only superficially. To learn deeply, people must understand things at a conceptual level: why things work as they do. This requires more than a few words. Wong’s book has too few. In total, the book includes 120 pages of actual content, which consists mostly of figures. The fact that so many figures exist is not the problem; it is in failing to explain her recommendations that she errs. She says “Do this and don’t do that,” but rarely helps her readers understand why. One problem with this is that Wong isn’t always right, but people who are learning about information graphics for the first time won’t realize this.
Wong states a few rules that entirely miss the mark, but more often she emphatically states what are at best rules of thumb, which must allow many exceptions. While reading the book, I found myself frequently writing comments in the margins such as “it all depends” and even “not true.” To give you a sense of this, here are a few excerpts from the book, followed by my margin comments:
|Wong’s Words|My Margin Comments|
|---|---|
|“Do not plot more than four lines on a simple [line] chart.” (p. 54)|Rule of thumb with many exceptions. Depending on the nature of the data (for example, how close the lines are in value and how much variability in values exists along the lines), a graph could contain many more than four lines and still work quite well. Also, when line graphs are used, not for comparing individual lines, but to provide an overview in a way that features exceptions and predominant patterns, far more than four lines can be included.|
|“Don’t use different colors or colors on the opposite side of the color wheel in a multiple-bar chart.” (p. 40)|It depends. Different hues work best for differentiating items, which is what’s usually needed in line graphs with multiple lines, bar graphs with multiple sets of bars, and so on.|
|“Choose the y-axis scale so that the height of the fever line occupies roughly two-thirds of the chart area.” (p. 51)|Ineffective rule. I think what Wong’s trying to do is bank the line to 45° so it’s not so flat that the trend and pattern can’t be seen, but this approach won’t guarantee this result. Setting the y-axis scale to begin just a little below the lowest value and end just a little above the highest value makes better use of the plot area. Once this is done, the aspect ratio of the graph (the ratio of its width to its height) can be adjusted to prevent the slope of the line from being either too shallow or too steep.|
|“A segmented bar chart in general is more effective than a pie chart at showing proportions of a whole.” (p. 79)|Not true. Actually, for showing a single part-to-whole relationship, a segmented (a.k.a., stacked) bar is never more effective than a pie chart, and in my opinion, neither works as well as a standard bar graph.|
|“Always label the value of a vertical bar if it is close to zero.”|It depends on how the graph is used. Labeling these values is only useful when people need precise values, and why would this rule apply to vertical bars and not to horizontal bars?|
|When it is appropriate to use different color intensities to differentiate series of bars in a bar graph, Wong states: “The shading of the bars should move from the lightest to the darkest for easy comparison.” (p. 67)|Huh? Why not ever from the darkest to the lightest bars?|
|“When plotting horizontal bars over time, the bars should be ordered from the most recent data point [at the bottom] and go back in time [proceeding upward].” (p. 71)|Don’t do this. I recommend that horizontal bars never be used for time-series data, because it is much more natural for people to think of time as proceeding horizontally from left to right.|
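The y-axis comment in the table can be made concrete with a rough sketch. The Python below is only an illustration under stated assumptions: the function names, the 5% padding, and the choice of the median-absolute-slope heuristic for banking toward 45° are mine, not anything taken from Wong's book. The first function fits the scale just outside the data, and the second suggests a width-to-height ratio for the plot area.

```python
from statistics import median


def tight_y_limits(values, pad_fraction=0.05):
    """Return (low, high) limits slightly outside the data range.

    pad_fraction is an assumed default, not a recommended constant.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Flat series: fall back to a span based on the value itself.
        span = abs(hi) or 1.0
    pad = span * pad_fraction
    return lo - pad, hi + pad


def banked_width_to_height(xs, ys, y_limits):
    """Suggest a width-to-height ratio so that the median segment
    slope is drawn at roughly 45 degrees (one possible heuristic)."""
    x_range = max(xs) - min(xs)
    y_range = y_limits[1] - y_limits[0]
    slopes = []
    for i in range(len(xs) - 1):
        # Normalize each step to a fraction of the plotted range.
        dx = (xs[i + 1] - xs[i]) / x_range
        dy = (ys[i + 1] - ys[i]) / y_range
        if dx != 0 and dy != 0:
            slopes.append(abs(dy / dx))
    if not slopes:
        return 1.0  # no sloped segments; a square plot is as good as any
    # A segment with normalized slope s is drawn at 45 degrees
    # when width / height equals s.
    return median(slopes)


# Hypothetical data, for illustration only.
months = [0, 1, 2, 3, 4]
prices = [10, 12, 11, 15, 14]
limits = tight_y_limits(prices)
ratio = banked_width_to_height(months, prices, limits)
```

The suggested ratio is a starting point to adjust by eye; other banking methods exist and may suit a given data set better.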
This is just a sample of the problems that I noted. Another point on which Wong and I definitely disagree has to do with her recommendations for making the quantitative scales of multiple line graphs different in an effort to make them more comparable, which she addresses in four different sections of the book. In one instance, she wants to make sure that people don’t miss the fact that the following two stocks increased at very different rates, which they might miss if they were shown the following graph.
Her solution is to show the following graph instead.
Although I share Wong’s concern, her solution is misleading. To feature the differences in percentage change, the same percentage scale could be used for both graphs, as shown below.
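As a rough sketch of the shared percentage scale, the Python below converts each price series to percentage change from its first value and then derives a single pair of axis limits for both graphs. The stock prices and function names are hypothetical, chosen only to illustrate the idea; they do not come from the book.

```python
def percent_change_from_start(prices):
    """Convert prices to percentage change relative to the first value."""
    base = prices[0]
    if base == 0:
        raise ValueError("the first price must be non-zero")
    return [(p - base) / base * 100.0 for p in prices]


def shared_limits(*series):
    """Return one (min, max) pair covering every series, so that all
    graphs can use the same quantitative scale."""
    all_values = [v for s in series for v in s]
    return min(all_values), max(all_values)


# Hypothetical prices for two stocks with very different magnitudes.
stock_a = [10.0, 12.0, 15.0, 20.0]
stock_b = [100.0, 105.0, 108.0, 110.0]

pct_a = percent_change_from_start(stock_a)
pct_b = percent_change_from_start(stock_b)

# Both graphs would be drawn with these same y-axis limits.
y_min, y_max = shared_limits(pct_a, pct_b)
```

Because both series are plotted against the same limits, a steeper line really does mean a larger percentage gain, which is the comparison the separate scales obscure.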
The best solution, however, unless the differences in the magnitudes of change really don’t matter, would be to tell a richer story by presenting the following collection of graphs.
Given that Wong studied under Tufte’s supervision at Yale, I expected to find little with which I would disagree. I was surprised to discover otherwise. Despite our disagreements, I agree with most of Wong’s suggestions, but in almost all such cases she restates what I and others have said before. If you’re already an expert in data visualization, you’ll learn little from this book, except a few techniques that are specific to financial journalism. If you’re a novice hoping to learn the fundamentals of information graphics, be warned that this book advocates a few bad practices along with the good, and it rarely explains the concepts that you must understand to produce effective graphs on your own.