Thanks for taking the time to read my thoughts about Visual Business
Intelligence. This blog provides me (and others on occasion) with a venue for ideas and opinions
that are either too urgent to wait for a full-blown article or too
limited in length, scope, or development to require the larger venue.
For a selection of articles, white papers, and books, please visit
March 22nd, 2016
Science is the best method that we’ve found for seeking truth. I trust science, but I don’t trust scientists. Science itself demands that we doubt and therefore scrutinize the work of scientists. This is fundamental to the scientific method. Science is too important to allow scientists to turn it into an enterprise that primarily serves the interests of scientists. Many have sounded the alarm in recent years that this tendency exists and must be corrected. BBC Radio 4 recently aired a two-episode series by science journalist Alok Jha titled “Saving Science from the Scientists.” Jha does an incredible job of exposing some of the ways in which science is currently failing us, not because its methods are flawed, but because scientists often fail to follow them.
This system can’t just rely on trust. Transparency and openness have to be implicit. In speaking with scientists it became clear to me that the culture and incentives within the modern scientific world itself are pushing bad behavior.
We all have a stake in this. Science has formed, and will continue to form, a big part of modern life, but we seem to have given scientists a free pass in society. Perhaps it’s time to knock scientists off their pedestal, bring them down to our level, and really scrutinize what they’re up to. Let’s acknowledge and account for the humans in science. It will be good for them and it will be good for us.
Marc Edwards, the Virginia Tech professor who exposed the high levels of lead in the water of Flint, Michigan, expresses grave concerns about our modern scientific enterprise. Bear in mind that the toxins that he discovered and exposed had been denied by government scientists. Here’s a bit of Jha’s interview with him:
My fear is that someday science will become like professional cycling, where, if you don’t cheat you can’t compete…The beans that are being counted for success have almost nothing to do with quality. It has to do with getting your number of papers, getting your research funding, inflating your h-index, and frankly, there are games that people play to make these things happen.
The h-index is a ranking system for scientists that is based on the number of publications and citations by others of those publications. Science is a career. To advance, you must publish and be cited. This perverts the natural incentives of science from a pursuit of knowledge to a pursuit of professional advancement and security.
Even the much praised process of peer review is often dysfunctional. Reviewers are often unqualified. Even more of a problem, however, is the fact that they are busy and therefore take little time in their reviews, glossing over the surface of studies that cannot be understood without greater time and thought. How can we address problems in the peer review process? Jha suggests a few thoughts on the matter.
There is a way to tackle these issues, and that’s by opening up more of the scientific process to outside scrutiny. Peer review reports could be published alongside the research papers. Even more importantly, scientists could be releasing their raw data too. It’s an approach that’s already revolutionized the quality of work in one field.
The field that he was referring to in the final sentence was genetics. There was a time when the peer review process in genetics was severely flawed, but steps were taken to put this right.
Dysfunction in the scientific process varies in degree among disciplines. Some are more mature in their efforts to enforce good practices than others. Some, such as infovis research, have barely begun the process of implementing the practices that are needed to promote good science. It is not encouraging, however, that this fledgling field of research has already erected the protections against scrutiny that we have come to expect only from long-term and entrenched institutionalization. The responses that I’ve received from officials in the IEEE InfoVis community to my extensive and thoughtful critiques of its published studies are in direct conflict with the openness that those leaders should be encouraging. When they deny that problems exist or insist that they are addressing them successfully behind closed doors, I can’t help but think of the Vatican’s response for many years to the problem of child molestation. No, I am neither comparing the gravity of bad research to child molestation nor am I comparing researchers to malign priests, but am instead comparing the absurd protectionism of the infovis research community’s leaders to that of Catholic leadership. Systemic problems do exist in the infovis research community and they are definitely not being acknowledged and addressed successfully. Just as in other scientific disciplines, infovis researchers are trapped in a dysfunctional system of their own making, yet they defend and maintain it rather than correcting it for fear of recrimination. They’re concerned that to speak up would result in professional suicide. By remaining silent, however, they are guaranteeing the mediocrity of their profession.
Jha sums up his news story with the following frank reminder:
There’s nothing better than science in helping us to see further, and it’s therefore too important to allow it to become just another exercise in chasing interests instead of truths…We need to save scientific research from the business it’s become, and perhaps we need to remind scientists that it’s us, the public, that gives them the license to do their work, and it’s us to whom they owe their primary allegiance.
I’m not interested in revoking anyone’s license to practice science; I just want to jolt them into remembering what science is, which is much more than a career.
March 1st, 2016
Modern science relies heavily on an approach to the assessment of uncertainty that is too narrow. Scientists rely on statistical measures of significance to establish the merits of their findings, often without fully understanding the limitations of those statistics and the original intentions for their use. P-values and even confidence intervals are cited as stamps of approval for studies that are meaningless and of no real value. Researchers strive to reach significance thresholds as if that were the goal, rather than the addition of useful knowledge. In his book Willful Ignorance: The Mismeasure of Uncertainty, Herbert I. Weisberg, PhD, describes this impediment to science and suggests solutions.
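The problem of treating significance thresholds as the goal is easy to demonstrate. The following short simulation (my own sketch, not from Weisberg’s book) runs thousands of experiments in which no effect exists at all, yet a predictable fraction of them still cross the conventional p < 0.05 threshold:

```python
import math
import random

random.seed(42)

def null_experiment(n=30):
    """Draw a sample from N(0, 1) and z-test it against a true mean of 0.

    Because the null hypothesis is true by construction, any
    'significant' result is a false positive.
    """
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(sample) / n) * math.sqrt(n)  # known sigma = 1
    return abs(z) > 1.96  # two-sided p < 0.05

trials = 10_000
false_positives = sum(null_experiment() for _ in range(trials))
print(f"{false_positives / trials:.1%} of null experiments reached significance")
```

Roughly 5% of these no-effect experiments “succeed,” which is exactly why a significance stamp, absent replication and judgment, guarantees nothing.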
This book is for researchers who are dissatisfied with the way that probability theory is being applied to science, especially those who work in the social sciences. Weisberg describes the situation as follows:
To achieve an illusory pseudo-certainty, we dutifully perform the ritual of computing a significance level or confidence interval, having forgotten the original purposes and assumptions underlying such techniques. This “technology” for interpreting evidence and generating conclusions has come to replace expert judgment to a large extent. Scientists no longer trust their own intuition and judgment enough to risk modest failure in the quest for great success. As a result, we are raising a generation of young researchers who are highly adept technically but have, in many cases, forgotten how to think for themselves.
In science, we strive for greater certainty. Probability is a measure of certainty. But what do we mean by certainty? What we experience as uncertainty arises from two distinct sources: doubt and ambiguity. “Probability in our modern mathematical sense is concerned exclusively with the doubt component of uncertainty.” We measure it quantitatively along a scale from 0 (certainty that something will not occur) to 1 (certainty that it will). Statistical measures of probability do not address ambiguity. “Ambiguity pertains generally to the clarity with which the situation of interest is being conceptualized.” Ambiguity—a state of confusion, of simply not knowing—does not lend itself as well as doubt to quantitative measure. It is essentially qualitative. When we design scientific studies, we usually strive to decrease ambiguity through various controls (selecting a homogeneous group, randomizing samples, limiting the number of variables, etc.), but this form of reductionism distances the objects of study from the real world in which they operate. Efforts to decrease ambiguity require judgments, which require expertise regarding the object of study that scientists often lack.
A chasm exists in modern science between researchers, who focus on quantitative measures of doubt and practitioners who rely on qualitative judgments to do their work. This is clearly seen in the world of medicine, with research scientists on one hand and clinicians on the other. “We have become so reliant on our probability-based technology that we have failed to develop methods for validation that can inform us about what really works and, equally important, why.” Uncertainty reduction in science requires a collaboration between these artificially disconnected perspectives.
Our current methodological orthodoxy plays a major role in deepening the division between scientific researchers and clinical practitioners. Prior to the Industrial Age, research and practice were more closely tied together. Scientific investigation was generally motivated more directly by practical problems and conducted by individuals involved in solving them. As scientific research became more specialized and professionalized, the perspectives of researchers and clinicians began to diverge. In particular, their respective relationships to data and knowledge have become quite different.
As I’ve said through various critiques of research studies and discussions with researchers, this chasm between researchers and expert practitioners is especially wide in the field of information visualization and seems to be getting wider.
To make his case, Weisberg takes his readers through the development of probability theory from its beginnings. He does this in great detail, so be forewarned that this assumes an interest in the history of probability. In fact, this history is quite interesting, but it does make up the bulk of the book. It is necessary, however, to help the reader understand the somewhat arbitrary way in which statistical probability was conceptualized in the context of games of chance, as well as the limitations of that particular framing. Within this conceptual perspective, specific statistics such as correlation coefficients and P-values were developed for specific purposes that should be understood.
In the conduct of scientific research, we have the choice of a half-empty or half-full perspective. We must judge whether we really do understand what is going on to some useful extent, or must defer to quantitative empirical evidence. Statistical reasoning seems completely objective, but can blind us to nuances and subtleties. In the past, the problem was to teach people, especially scientists and clinicians, to apply critical skepticism to their intuitions and judgments. Thinking statistically has been an essential corrective to widespread naiveté and quackery. However, in many fields of endeavor, the volumes of potentially relevant data are growing exponentially…Unfortunately, the capacities for critical judgment and deep insight we need may be starting to atrophy, just as opportunities to apply them more productively are increasing.
Don’t assume that Weisberg wants to dismantle the mechanisms of modern science. Instead, he wants to augment them to advance knowledge more effectively.
Is there a way to avoid the regression of science? The answer is surprisingly simple, in principle. We must recognize that probability theory alone is insufficient to establish scientific validity. There is only one foolproof way to learn whether an observed finding, however statistically significant it may appear, might actually hold up in practice. We must dust off the time-honored principle of replication as the touchstone of validity…Only when the system demands and rewards independent replications of study findings can and should public confidence in the integrity of the scientific enterprise be restored.
In addition to study replication, Weisberg also strongly advocates a merging of the perspectives and skills of researchers and practitioners.
Theoretical knowledge and insight can often be helpful in focusing or promoting attention on a promising subset of variables. Understanding causal processes will often improve the chances of success, and of identifying factors that are interpretable by clinicians. Clinical insight applied to individual cases will depend on understanding causal mechanisms, not just blind acceptance of black-box statistical models.
Weisberg goes on to suggest ways in which current computer technologies and rapidly expanding data collections create new opportunities for the conduct of science, in many respects similar to Ben Shneiderman’s vision of Science 2.0. Opportunities abound, but they will remain untapped if we fail to correct glaring flaws in our current approach to scientific research. Weisberg knows that this won’t be easy, but he exhibits a balance between concern for systemic dysfunction and optimism for progress. Even more, he offers specific suggestions for setting this progress in motion.
This is a marvelous book—well-written and the product of exceptional thinking. If the role of statistics in research does not interest or concern you, don’t buy this book, for you won’t stick with it. If you share my concerns, however, that science must be renovated and augmented to address the challenges of today and that our understanding and use of probability theory is central to this effort, this book is worth your time.
February 24th, 2016
A few days ago, I received an email from a professor who teaches information visualization about a recent research study titled “Hypothetical Outcome Plots Outperform Error Bars and Violin Plots for Inferences About Reliability of Variable Ordering.” The study, done by Jessica Hullman, Paul Resnick, and Eytan Adar, was published by the journal PLOS ONE on November 15, 2015. The professor asked if I was familiar with it, and, if so, what I thought of it. I wasn’t aware of it, but that soon changed. This study is nonsense—another representation of dysfunction within the infovis research community. Like many infovis researchers, the authors appear to be naive about the ways that people use information visualization in the real world and what actually works for the human brain. In this blog post I’ll highlight the study’s problems and describe the conditions that, I suspect, gave rise to them.
Hypothetical Outcome Plots (HOPs) were created by the authors of this study to display one or more sets of quantitative values so that people can see how those values are distributed and, when multiple sets are displayed, compare the distributions. HOPs do this, not as a static display, such as a histogram or box plot, but as an animated series of values that appear sequentially, 400ms per frame. The following example shows a single HOPs frame (i.e., one of many values).
When animated, the blue line, which represents a single value, changes position to display several values in a data set, one at a time. In the figure below, the animated HOPs display on the right represents the same two normal distributions that are displayed on the left with blue lines to mark the means and error bars to represent 95% confidence intervals.
The authors make the following claim about the merits of their study: “Our primary contribution is to provide empirical evidence that untrained users can interpret and benefit from animated HOPs.” When I came across this claim early in the paper, I could not imagine how HOPs could ever serve as a viable substitute for graphs that summarize distributions, or, how untrained users would find them more informative than a simple descriptive sentence. What was I missing that allowed the authors to make this claim? Upon further review, I discovered that the authors devised experiments that 1) restricted the usefulness of the static distribution graphs that they pitted against HOPs, and 2) asked subjects to perform useless tasks that were customized to match the abilities of HOPs. Simply put, the authors stacked the deck in favor of HOPs, yet were still unable to back their claims.
Before diving into the study, let’s remind ourselves of what data visualization is. Here’s a definition that I’ve been presenting recently in lectures:
Data visualization is technology-augmented visual thinking and communication about quantitative data.
Data visualization involves human-computer collaboration. We use visual perception and cognition to do what they do well and we allow computers to assist us by performing tasks that they can do better than we can. Not every task is performed by the human visual system. Only those tasks that the visual system can handle better than cognition or a computer are performed in this way. Skilled data visualizers distribute the work of data sensemaking appropriately between perception, cognition, and the computer in ways that leverage the strengths and avoid the weaknesses of each.
Now, back to the study. The authors were inspired by the fact that people think about proportions more naturally in terms of frequencies (counts) rather than percentages. For example, those who have not learned to think comfortably in terms of percentages often find it easier to understand the expression “57 out of 100 people” than the equivalent expression “57% of the people.” With this in mind, it apparently occurred to the authors that, if they represented distributions as a randomly selected set of 100 values and presented those values one at a time as an animation, people could potentially engage in counting to examine and compare distributions.
Let’s think about the characteristics that describe distributions and therefore typically need to be represented by distribution displays. In general, we describe the nature of a distribution in terms of the following three characteristics:
- Spread (the range across which the values are distributed)
- Central Tendency (a measure of the distribution’s center)
- Shape (the pattern that is formed by the set of values when arranged from lowest to highest)
Each of these characteristics answers specific questions that we typically ask about distributions. For example, the central tendency answers such questions as, “What value is most typical?” and “What single value is most representative of the set as a whole?” The shape answers questions such as, “Where are most of the values located?” and “Is the distribution normal, skewed, uniform, bimodal, etc.?” Several graphs have been developed to display these characteristics of distributions, including histograms, frequency polygons, strip plots, quantile plots, violin plots, box plots, and Q-Q plots. They vary in the characteristics that they feature and therefore in the purposes for which they are used. None of the displays that have been developed for this purpose rely on counting. Anyone who needs to examine and compare distributions can easily learn to use these graphs. I know this, because I teach people to use them.
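None of these three characteristics requires counting; all of them are ordinary summary statistics that any analytical tool computes directly. A minimal sketch using hypothetical values:

```python
import statistics

# A hypothetical sample of values
values = [12, 15, 15, 17, 18, 20, 21, 21, 22, 25, 31, 44]

spread = max(values) - min(values)      # spread: the range of the values
mean = statistics.mean(values)          # central tendency
median = statistics.median(values)      # robust central tendency
# A crude hint at shape: mean well above median suggests a right skew
skew_hint = mean - median

print(f"spread={spread}, mean={mean:.2f}, median={median}, mean-median={skew_hint:.2f}")
```

A histogram or box plot then displays these same characteristics visually, which is why those well-established graphs, rather than counting, are the natural tools for examining distributions.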
But what about those occasions when we need to explain something about one or more distributions to a lay audience? If that information is best described visually, we use a simple distribution graph and explain how to read it. If that information can be communicated just as well in words and numbers, we take that approach. For a lay audience, anything that HOPs could possibly display could be more clearly presented in a simple sentence. Counting values as they watch an animated display is never a viable solution.
In this study, only one of the tasks that subjects were asked to perform was typical of the questions that we ask when examining distributions. The others were contrived to rely on counting to suggest a use of HOPs. Even if counting could answer some questions that we might ask about distributions, should that lead us to invent a form of data visualization to support the task of counting? Think about it. Is it humans or computers that excel at counting? Clearly, it isn’t humans. We count slowly and are prone to error, but counting is a task that computers were specifically designed to do at great speed and accuracy. What I’m pointing out is that HOPs are an attempt to use the human visual and cognitive systems to do something that is handled far better by a computer. The authors’ attempt to create a counting visualization was fundamentally misguided.
Let’s review the study to see how the authors fallaciously ascribed benefits to an ineffective form of display.
The Study’s Design
The study tested the ability of subjects to perform various tasks while exclusively examining normal distributions using one of the three following displays:
- A short horizontal blue line to mark the mean and error bars to show 95% of the spread around the mean.
- A violin plot in which the widest blue area marks the mean, the top and bottom show the spread, and the varying width of the blue area shows the shape.
- HOPs, in which the lowest and highest positions where values appear during the animation suggest the spread, the position in the middle of that range suggests the mean, and the frequency of values appearing in particular ranges suggests the shape.
Each data set was created by randomly selecting 5,000 values from a larger, normally distributed data set. When displayed as HOPs, however, not all of the values were included. Instead, a random sample of approximately 100 values was selected, varying somewhat from task to task, with a median of 89 values and a mean of 101. These values were then displayed individually, in random order, as an animation. Subjects were given the ability to pause the sequence and to manually advance it forward or backward, frame by frame. The frames were numbered and the numbers were visible to subjects so they could tell where they were in the series at any time, along with the total number of frames.
The study was divided into three major sections based on the number of distributions that were shown: 1) four tasks involving one-distribution displays, 2) four tasks involving two-distribution displays, and 3) one task involving a three-distribution display. Sample displays for each of these three sections are illustrated below using violin plots.
The tasks that subjects were asked to perform differed in each of the three sections. Let’s examine each section in turn.
While viewing each single-distribution display, subjects were asked to perform three tasks: 1) identify the mean, 2) estimate what proportion of values were located above a particular point, and 3) estimate what proportion of values fell within a specified range (always multiples of 10, such as from 20 to 50).
As you can probably imagine, when asked to identify the mean, subjects could easily do this when viewing the error bar and violin plot displays. With HOPs, subjects performed less well when the values were distributed across a large spread, as you would expect, but almost as well when the values were distributed across a small spread. This makes sense. In HOPs, when the line that marks the value hops around within a narrow region, it is easy to estimate a position near the center of that region. This was the only task that subjects were asked to perform that was typical of actual tasks that are done with distributions in the real world and wasn’t devised to match the abilities of HOPs.
When subjects were asked to estimate the proportion of values that fell above a particular point or within a particular range, they performed better when using HOPs, which isn’t surprising when you consider the study’s design. The number of values that were shown using HOPs was always relatively close to 100. HOPs supported these specific tasks by inviting subjects to count the number of times that the line appeared above the specified threshold or within the specified range. On the other hand, the error bar display provided no information about the shape of the distribution, so it could not support this task at all. The violin plot provided information about the shape of the distribution, but it is difficult to estimate the percentage of values that fall above a particular position or within a particular range based on the varying width of the blue shaded area. These are not tasks that we would perform using violin plots. Using HOPs to perform these tasks took some time and effort, but it provided the easiest way to answer the questions. It would be ludicrous to conclude from this, however, that HOPs would ever provide the best way to examine distributions. If we ever needed to perform these particular tasks, we would use a different form of display. For example, a histogram with binned intervals of 10 (0-9, 10-19, etc.) would make it easy to determine the proportion of values in a specified range. Nevertheless, this isn’t a task that we would ordinarily rely on our visual system to handle, but would rely on the computer to respond to a specific query. Queries, which can be generated in various ways, provide precise and efficient answers. For example, “What percentage of values fall above the threshold of 76?” Expressed in computer terms, we would request a count of the rows where the value of some measure is greater than 76.
Virtually all analytical tools support queries such as this, and good visualization tools allow graphs to be brushed to select particular ranges of values to retrieve the precise number or percentage of values associated with those ranges. In light of these efficient and precise options, who would choose to count items while watching a HOPs animation?
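The kind of query described above is a one-liner in any analytical environment. A sketch with hypothetical data, including the binned counts that answer the “within a range” question directly:

```python
import random
from collections import Counter

random.seed(7)

# A hypothetical measure: 5,000 normally distributed values
values = [random.gauss(70, 10) for _ in range(5000)]

# "What percentage of values fall above the threshold of 76?"
threshold = 76
pct_above = sum(v > threshold for v in values) / len(values) * 100
print(f"{pct_above:.1f}% of values fall above {threshold}")

# Binned counts (0-9, 10-19, ...) answer range questions just as directly
bins = Counter((int(v) // 10) * 10 for v in values)
print(f"values in [70, 80): {bins[70]}")
```

The computer answers both questions precisely and instantly; counting blue lines through a 40-second animation cannot compete with this.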
In this section of the study, the means of two independent distributions, A and B, were deliberately made to differ such that B was, on average, greater in value than A. Subjects were asked to compare the distributions. Typically, when comparing two independent distributions, we would ask questions such as:
- “On average, which is greater, A or B, and by how much?”
- “Which exhibits greater variation, A or B?”
- “How do distributions A and B differ in shape?”
Instead of a question along these lines, however, subjects were asked to determine “how often” B was larger than A out of 100. This is a strange question. Imagine looking at the violin plot below and being asked to determine how often values of B are larger than A.
The question doesn’t make sense, does it? The closest question that makes sense is probably “On average, how much greater are values of B than A?” With two normal distributions, this could be determined by comparing their means. The strange question that subjects were asked was clearly designed to demonstrate a use of HOPs. Subjects were directed to count the number of times, frame by frame, that the value of B was higher than A. Those subjects who viewed HOPs, rather than the error bar or violin plot displays, supposedly succeeded in demonstrating the benefits of HOPs if they could count. But what did the HOPs display actually tell them about the two distributions? Remember, the authors are proposing HOPs as a useful form of display for people who don’t understand how to read conventional distribution displays. Imagine that, by counting, the untrained viewer determines that B is greater than A 62 out of 100 times. Does this mean that B is 62% greater than A? It does not. What, then, has the viewer meaningfully learned about the two distributions by viewing HOPs? Unfortunately, the authors don’t tell us.
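What the counting task actually estimates is the probability that a randomly drawn value of B exceeds a randomly drawn value of A, something a computer produces instantly without any animation. A sketch with two hypothetical normal distributions:

```python
import random

random.seed(3)

# Two hypothetical independent distributions; B's mean is deliberately higher
a = [random.gauss(50, 10) for _ in range(100)]
b = [random.gauss(55, 10) for _ in range(100)]

# The study's task: pair the samples arbitrarily and count how often B wins
wins = sum(bv > av for av, bv in zip(a, b))
print(f"B exceeded A in {wins} of 100 arbitrary pairings")
```

The resulting count estimates P(B > A) for an arbitrary pairing; it does not mean that B is that percentage greater than A, which is precisely the confusion an untrained viewer is left with.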
What if, rather than comparing two datasets as separate distributions, we want to compare pairs of values to see how they relate to one another? For instance, imagine that we want to see how women’s salaries relate to their male spouse’s salaries to see if one tends to be higher than the other. We can’t see how two sets of values are paired using error bars or violin plots. Are HOPs appropriate for this? They are not. If we wish to examine relationships between two sets of paired values, we’ve moved from distribution analysis to correlation analysis, so we need different types of graphs, such as scatterplots. Watching HOPs animations to examine correlations would never match the usefulness of a simple scatterplot.
In this final section of the study, consisting of distributions A, B, and C, subjects were asked to determine “how often” B was larger than both A and C. Fundamentally, this is the same task as the one in the two-distribution displays section, only complicated a bit by the addition of a third variable. Even if it were appropriate to compare three independent distributions by randomly selecting a sample of 100 values from each, arbitrarily arranging them in groups of three—one value per variable—and counting the number of times B was greater than A and C, this is not a task that we would typically perform by viewing a visualization of any type. Instead, if we were examining the data ourselves, we would simply query the data to determine the number or percentage of instances in which B > A and B > C. Watching a time-consuming animation would be absurd. Or, if we were reporting our findings to untrained users, we would do so with a simple sentence, such as “B is greater than both A and C in 32 out of 100 instances.”
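The three-distribution question is again just a filtered count. A sketch with hypothetical distributions:

```python
import random

random.seed(11)
n = 100

# Three hypothetical independent distributions; B's mean is deliberately highest
a = [random.gauss(50, 10) for _ in range(n)]
b = [random.gauss(58, 10) for _ in range(n)]
c = [random.gauss(52, 10) for _ in range(n)]

# The query: in how many arbitrary (A, B, C) groupings is B the largest?
count = sum(bv > av and bv > cv for av, bv, cv in zip(a, b, c))
print(f"B is greater than both A and C in {count} out of {n} instances")
```

One line of filtering replaces a three-line animation that a viewer would have to track, frame by frame, with mental tallies for each comparison.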
The flaws in this study should be obvious to anyone with expertise in data visualization, so how is it that this study was performed by academics who specialize in infovis research and how did it pass the peer review process, resulting in publication? In part, I think people are inclined to embrace this study because it exhibits two qualities that are attractive to the infovis research community: 1) it proposes something new (innovation is valued above effectiveness by many in the field), and 2) it features animation, which is fun. Who can resist the natural appeal of “dancing data?” Those of us who rely on data visualization to do real work in the world, however, don’t find it difficult to resist inappropriate animations. Those who approach data visualization merely as the subject matter of research publications that will earn them notoriety and tenure are more susceptible to silly, ineffective visualizations. As long as the research community embraces this nonsense, it will remain of little value to the world. If you’re involved in the research community, this should concern you.
How am I able to find flaws in studies like this when researchers, including professors, miss them? It isn’t because I’m smarter than the average researcher. What sets me apart is my perspective. Unlike most researchers, I’m deeply involved in the actual use of data visualization and have been for many years. Because I work closely with others who use data visualization to solve real-world problems, I’m also painfully aware of the cost—the downright harm—of doing it poorly. These perspectives are foreign to many in the infovis research community. You cannot do good infovis research without first developing some expertise in the actual use of data visualization. This should be obvious, without needing to be said, but sadly, it is not.
P.S. I realize that this critique will likely ignite another shit storm of angry responses from the infovis research community. I will be accused of excessive harshness. Rather than responding to the substance of my critique, many will focus on my tone. To the degree that my critiques are sometimes harsh in tone, rest assured that I’ve crafted a tone that I believe is appropriate and necessary. I’m attempting to cut through the complacency of the infovis research community. If you believe that there is a kinder, gentler way to bring poor and potentially harmful research to light, I invite you to make the attempt. If your approach succeeds where mine fails, I will embrace you with gratitude. The best way to address the problem of poor research, of course, is to nip it in the bud before it is published, but that clearly isn’t happening.
February 23rd, 2016
Tomorrow, February 24, PBS will air a new documentary titled “The Human Face of Big Data.” I’m a longtime supporter of PBS, but on occasion they get it wrong (e.g., when they provide a showcase for charlatans such as Deepak Chopra). In this new documentary, they apparently get much of it wrong by putting a happy face on Big Data that ignores the confusion and false claims that it promotes, as well as all but one of its risks. The tech journalist Gil Press has skillfully revealed the documentary’s flaws in a review for Forbes titled “A New Documentary Reveals a One-Dimensional Face of Big Data.” Gil and I had a chance to get acquainted a few years ago and I’ve come to appreciate the voice of sanity that he often raises in response to technological hype and misinformation. I strongly recommend that you read Gil’s review to restore balance to the force.
January 19th, 2016
In a previous blog post titled “Potential Information Visualization Research Projects,” I announced that I would prepare a list of potential research projects that would address actual problems and needs that are faced by data visualization practitioners. So far I’ve prepared an initial 33-project list to seed an ongoing effort, which I’ll do my best to maintain as new ideas emerge and old ideas are actually addressed by researchers. These projects do not appear in any particular order. My intention is to help practitioners by making researchers aware of ways that they can address real needs. I will keep a regularly updated list of project ideas as a PDF document, but I’ve briefly described the initial list below. The list is currently divided into three sections: 1) Effectiveness and Efficiency Tests, 2) New Solution Designs and Tests, and 3) Taxonomies and Guidelines.
Some of the projects that appear in the Effectiveness and Efficiency Tests section have been the subject matter of past projects. For example, several projects in the past have tested the effectiveness of pie charts versus bar graphs for displaying parts of a whole. In these cases I feel that the research isn’t complete. Apparently, some people feel that the jury is still out on the matter of pie charts versus bar graphs, so it would be useful for new research to more thoroughly establish, more comprehensively address, or perhaps challenge existing knowledge.
Please feel free to respond to this blog post or to me directly at any time with suggestions for additional research projects or with information about any projects on this list that are actually in process or already completed.
Effectiveness and Efficiency Tests
- Determine the effects of non-square aspect ratios on the perception of correlation in scatterplots.
- Determine the effectiveness of bar graphs compared to dot plots when the quantitative scale starts at zero.
- Determine the relative speed and effectiveness of interpreting data when presented in typical dashboard gauges versus bullet graphs (one of my inventions).
- Determine the effectiveness of wrapped graphs (one of my inventions) compared to treemaps when the number of values does not exceed what a wrapped graph display can handle.
- Determine the effectiveness of bricks (one of my inventions) as an alternative to bubbles in a geo-spatial display.
- Determine the effectiveness of bandlines (one of my inventions) as a way of rapidly seeing magnitude differences among a series of sparklines that do not share a common quantitative scale.
- Determine if donut charts are ever the most effective way to display any data for any purpose.
- Determine if pie charts are ever the most effective way to display any data for any purpose.
- Determine if radar charts are ever the most effective way to display any data for any purpose.
- Determine if mosaic charts are ever the most effective way to display any data for any purpose.
- Determine if packed bubble charts are ever the most effective way to display any data for any purpose.
- Determine if dual-scaled graphs are ever the most effective way to display any data for any purpose.
- Determine if graphs with 3-D effects (e.g., 3-D bars) are ever the most effective way to display any data for any purpose.
- Determine which is more effective: displaying deviations in relation to zero or 100%. For example, if you wish to display the degree to which actual expenses varied in relation to the expense budget, would it work best to represent variances as positive or negative percentages above or below zero, or as percentages less than or greater than 100%?
- Determine the effectiveness of various designs for Sankey diagrams in an effort to recommend design guidelines.
- Determine the best uses of various network diagram layouts (centralized burst, arc diagrams, radial convergence, etc.).
- Determine the effectiveness of word clouds versus horizontal bar graphs (or wrapped graphs).
- Determine which shapes are most perceptible and distinguishable for data points in scatterplots.
- Determine the effectiveness of large data visualization walls versus smaller, individual workstations.
- Determine if the effectiveness of displaying time horizontally from left to right depends on one’s written language or is more fundamentally built into the human brain.
- Determine if the typical screen scanning pattern beginning at the upper left depends on one’s written language or is more fundamentally built into the human brain.
- Determine the relative speed and effectiveness of interpreting particular patterns in data when displayed as numbers in tables or visually in graphs. For example, compare a table that displays 12 monthly values per row versus a line graph that displays the same values (i.e., twelve monthly values per line) to see how quickly and effectively people can interpret various patterns such as trending upwards, trending downwards, particular cyclical patterns, etc. We know that it is extremely difficult to perceive patterns in tables of numbers, but it would be useful to actually quantify this performance.
- Determine the relative speed of finding outliers in tables of numbers versus graphs.
- Determine the relative benefits of using a familiar form of display versus one that requires a few seconds of instruction. The argument is sometimes made that a graph must be instantly intuitive because making people learn how to read an unfamiliar form of display is too costly in time and cognitive effort. For example, population pyramids provide a familiar way for people who routinely compare the age distributions of males versus females in a group, yet a frequency polygon, although unfamiliar, might provide a way to see how the distributions differ much more quickly and easily. In cases when people can be taught to read an unfamiliar form of display with little effort, does it make sense to do so rather than continuing to use a form of display that works less effectively?
New Solution Designs and Tests
- Develop an effective way to show proportional highlighting, as it is used in brushing and linking, for portions of the following graphical objects: bars, lines, and box plots. Various ways to show proportional highlighting have been applied to bar graphs, but not to line graphs and box plots.
- Develop a way to automatically attach data labels to the ends of lines in a line graph without overlapping.
- Develop a way to temporarily overlay or replace box plots with frequency polygons.
- Develop a way to automatically detect the amount of lag between two time series and then align the leading events with the lagging events in a line graph.
- Develop potential uses of blindsight to direct a person’s attention to particular sections of a display as needed (e.g., to something on a dashboard that needs attention).
- Develop an effective design for waterfall graphs when multiple transactions occur in the same interval of time and some are positive and some are negative.
- Develop an algorithm for automatically distributing several sets of time series values uniformly across a 100% scale when they have different starting points, ending points, and durations. For example, this would make it easy to compare the person hours associated with various projects across their lifespans, even when they differ in starting dates, ending dates, and durations.
- Develop a full set of interface mechanisms for making formatting changes to charts (turning grid lines on and off, changing the colors of objects, repositioning and orienting objects such as legends, changing the quantitative scale along an axis, etc.) that involves direct access to those objects rather than one that requires the user to wade through lists of formatting commands located elsewhere (e.g., in dialog boxes).
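To make the lag-detection project above more concrete, here is a minimal sketch of one plausible approach: slide the lagging series against the leading one and pick the offset with the highest correlation. This is only an illustration of the idea, not a prescribed method, and the data (marketing spend leading sales) is hypothetical.

```python
# Sketch: estimate the lag between two time series by testing each candidate
# offset and keeping the one with the highest Pearson correlation, then shift
# the lagging series so related events line up in a line graph.

def best_lag(leading, lagging, max_lag):
    """Return the offset (in intervals) of `lagging` that best matches `leading`."""
    def corr(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs)
        vy = sum((y - my) ** 2 for y in ys)
        return cov / ((vx * vy) ** 0.5) if vx and vy else 0.0

    scores = {}
    for lag in range(max_lag + 1):
        a = leading[: len(leading) - lag]  # truncate the leading series
        b = lagging[lag:]                  # drop the first `lag` values
        scores[lag] = corr(a, b)
    return max(scores, key=scores.get)

# Hypothetical example: marketing spend leads sales by two intervals.
spend = [10, 30, 50, 30, 10, 20, 40, 20]
sales = [5, 5, 12, 33, 52, 31, 12, 22]   # roughly spend shifted right by 2
lag = best_lag(spend, sales, max_lag=4)  # → 2
aligned_sales = sales[lag:]              # shift before plotting both lines
```

A production version would also need to handle negative lags, choose a principled maximum offset, and align the plotted axes rather than truncating data, but the core detection step is the part that invites research.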
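The 100%-scale project in the list above can likewise be sketched simply: rescale each series’ timestamps to percentages of that series’ own span, so projects with different start dates, end dates, and durations can be overlaid on one axis. The function name and day-number data are my own illustrative assumptions.

```python
# Sketch: map each project's measurement times onto a common 0-100% lifespan
# scale so series with different start dates and durations can be compared.

def to_percent_of_lifespan(timestamps):
    """Rescale a sorted list of timestamps to percentages of the series' span."""
    start, end = timestamps[0], timestamps[-1]
    span = end - start
    if span == 0:                       # degenerate single-interval series
        return [0.0 for _ in timestamps]
    return [100.0 * (t - start) / span for t in timestamps]

# Two hypothetical projects measured on different days and durations.
project_a_days = [0, 10, 20, 30, 40]    # 40-day project
project_b_days = [100, 150, 200]        # 100-day project with a later start
pct_a = to_percent_of_lifespan(project_a_days)  # [0.0, 25.0, 50.0, 75.0, 100.0]
pct_b = to_percent_of_lifespan(project_b_days)  # [0.0, 50.0, 100.0]
```

With both series on the same 0-100% axis, person-hours (or any measure) can be plotted against percent-of-lifespan rather than calendar dates, which is the comparison the project description calls for.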
Taxonomies and Guidelines
- Develop a useful taxonomy or set of guidelines to help people think about the differences in how data visualizations should be designed to support data sensemaking (i.e., data exploration and analysis) versus data communication (i.e., presentation).