What Is Data Visualization?

Since I founded Perceptual Edge in 2003, data visualization has transitioned from an obscure area of interest to a popular field of endeavor. As with many fields that experience rapid growth, the meaning and practice of data visualization have become muddled. Everyone has their own idea of its purpose and how it should be done. For me, data visualization has remained fairly clear and consistent in meaning and purpose. Here’s a simple definition:

Data visualization is a collection of methods that use visual representations to explore, make sense of, and communicate quantitative data.

You might bristle at the fact that this definition narrows the scope of data visualization to quantitative data. It is certainly true that non-quantitative data may be visualized, but charts, diagrams, and illustrations of this type are not typically categorized as data visualizations. For example, neither a flow chart, nor an organization chart, nor an ER (entity relationship) diagram qualifies as a data visualization unless it includes quantitative information.

The immediate purpose of data visualization is to improve understanding. When data visualization is done in ways that do not improve understanding, it is done poorly. The ultimate purpose of data visualization, beyond understanding, is to enable better decisions and actions.

Understanding the meaning and purpose of data visualization isn’t difficult, but doing the work well requires skill, augmented by good technologies. The human component is primary, but sadly it receives much less attention than the technological component. For this reason, data visualization is usually done poorly. The path to effective data visualization begins with the development of relevant skills through learning and a great deal of practice. Tools are used during this process; they do not drive it.

Data visualization technologies only work when they are designed by people who understand how humans interact with data to make sense of it. This requires an understanding of human perception and cognition. It also requires an understanding of what we humans need from data. Interacting with data is not useful unless it leads to an understanding of things that matter. Few data visualization technology vendors have provided tools that work effectively because their knowledge of the domain is superficial and often erroneous. You can only design good data visualization tools if you’ve engaged in the practice of data visualization yourself at an expert level. Poor tools exist, in part, because vendors care primarily about sales, and most consumers of data visualization products lack the skills that are needed to differentiate useful from useless tools, so they clamor for silly, dysfunctional features. Vendors justify the development of dumb tools by arguing that it is their job to give consumers what they want. I understand their responsibility differently. As parents, we don’t give our children what they want when it conflicts with what they need. Vendors should be good providers.

Data visualization can contribute a great deal to the world, but only if it is done well. We’ll get there eventually. We’ll get there faster if we have a clear understanding of what data visualization is and what it’s for.

Take care,

20 Comments on “What Is Data Visualization?”


By Chris Gerrard. May 4th, 2017 at 12:23 pm

I agree with your basic thesis, and have a couple of questions.

Does your definition accommodate the use of this presentation of the quantity ‘dozen’: 12
How about 1,276?
I believe it does. The decimal system was created in order to provide a compact, precise system of encoding quantities. It may seem too obvious to mention, or an uncommon way of thinking, but limiting data visualization to geometric forms seems to be a common position.

What about the presence of non-numeric categorical informational elements that provide the context for quantitative visual forms?
A bar chart of Sales per Department is likely to be of very little value without the individual Departments’ names (or some other identifier) labeling the bars, or without a notice that it’s about Sales.

I’m sympathetic to and agree with your position that “data visualization” has become muddied, often to the point of uselessness, but I can’t help thinking that data visualization also includes those labels/things that provide the identity and context for the quantitative bits.

By Stephen Few. May 4th, 2017 at 12:59 pm

Chris,

The fact that data visualizations include categorical labels, in my mind, goes without saying. Displaying quantities without identifying what they represent is meaningless.

If your questions about 12 and 1,276 are asking if numbers qualify as data visualization when they are presented textually (i.e., as alphanumeric characters) rather than graphically (i.e., as geometrical objects), my answer is “No.” Expressing numbers textually qualifies as linguistic communication, not visual communication. Clearly, expressing numbers textually is useful, but they are processed differently than graphics by the brain and therefore serve different purposes. When you arrange numbers tabularly in columns and rows, however, the graphical arrangement does qualify as a form of data visualization, at least in part. For this reason, tables are classified as a type of chart.

By Ladi Omole. May 4th, 2017 at 2:07 pm

Without a doubt, data visualization is needed more than ever, especially given the volume of data that is being generated.

To tell the story that brings insight into the massive data in a concise visual format surely requires hard work.

I also think the tools have improved over the years. While I will not want to favor one tool over the other, my experience in the industry has always favored Tableau as the game changer. Of course, other companies like Microsoft and Qlik woke up to the challenge with tools like Power BI and QlikView.

As the saying goes – “a fool with a tool is still a fool”. I agree with you that the human factor is primary, but the improvement in these tools has also contributed to the development of the data visualization field. There are lots of good open-source and commercial tools adopting the freemium model, which makes the tools accessible by being affordable and available.

This accessibility is needed globally to “contribute a great deal to the world” and make the world better.

The best is yet to come.

Ladi

By Jonathon Carrell. May 4th, 2017 at 5:03 pm

As you’ve stated the meaning has remained clear and consistent, I’ll again point to your own previous definition and non-quantitative examples which you described as data visualizations. At the time you made no qualifiers about requiring quantitative data for something to be considered a “data visualization”; only that quantitative data could further enrich visualizations of this type.

“Data visualization is the graphical display of abstract information for two purposes: sense-making (also called data analysis) and communication.” -Stephen Few

“Although data visualization usually features relationships between quantitative values, it can also display relationships that are not quantitative in nature. For instance, the connections between people on a social networking site such as Facebook or between suspected terrorists can be displayed using a node and link visualization. In the following example, people are the nodes, represented as circles, and their relationships are the links, represented as lines that connect them.

Visualizations that feature relationships between entities, such as the people in the example above, can be enriched with the addition of quantitative information as well. For example, the number of times that any two people have interacted could be represented by the thickness of the line that connects them.” – Stephen Few

Your current statement that non-quantitative visualizations are somehow disqualified is reflective of your current opinion and preference.

The purpose of data visualization is to enhance our understanding of a set of data … agreed. This type of enhanced understanding via data visualization can be facilitated for data sets both quantitative and non-quantitative, and yes there are wise and unwise approaches for both.

While I’ve long been supportive of your contributions to the field and share many of the same sentiments you have for so long championed, I still find your definition overly specific to the area of your greatest focus.

As to the rest of your article, I wholeheartedly agree.

Respectfully,

Jonathon Carrell

By Stephen Few. May 4th, 2017 at 5:16 pm

Jonathon,

You appear to be more familiar with my work than I am. The first quote assumed abstract information of a quantitative nature, even though this wasn’t explicitly stated, but the second quote suggests that my thinking has not been as consistent as I thought. I’m curious: where does the second quote appear in my work? Offhand, I don’t remember saying this, but I don’t doubt that I did. The vast majority of work that is classified as data visualization is quantitative in nature. It’s difficult to draw clear boundaries without some exceptions because some types of charts that usually include a quantitative component, such as network diagrams, sometimes display only relationships without anything quantitative.

By Jonathon Carrell. May 4th, 2017 at 6:02 pm

Both quotes (the definition and non-quantitative example) appear in the chapter you contributed to “The Encyclopedia of Human Computer Interaction” 2nd edition. The chapter title is “Data Visualization for Human Perception”. The online book along with your chapter are available at https://www.interaction-design.org.

In the words of Sherlock Holmes, “I never make exceptions. An exception disproves the rule.”

Joking aside, surely we can agree that the world is not so black and white. I’m convinced the inclusion of the word “quantitative” in a working definition is unnecessary. Even more so is the statement that any visualization that fails to include quantitative data is somehow disqualified. This comes across as your current personal sentiment stated as fact.

Jonathon

By Jonathon Carrell. May 4th, 2017 at 6:27 pm

I would also note your article http://www.perceptualedge.com/articles/visual_business_intelligence/our_fascination_with_all_things_circular.pdf. In it you redesigned a David McCandless chart titled “Colours in Culture”. Both the original and your redesign were purely non-quantitative. No qualifiers about it not being a data visualization are raised in the article, although you do point out (and rightly so) the shortcomings of the original design.

By Stephen Few. May 4th, 2017 at 6:50 pm

Jonathon,

My recreation of McCandless’ “Colours in Culture” chart is not a data visualization. I referred to it as an infographic, which is not synonymous with the term data visualization.

What I’m arguing is that a field of study and endeavor should have clear definitions. Even though it is difficult to define data visualization precisely, it is worthwhile to make the attempt. It is absolutely true that data visualization, as I understand and practice it, displays quantitative data. Many visualizations that don’t include quantitative data are closely related to data visualization, so the boundaries can be a little fuzzy at times, but that isn’t a reason to abandon the emphasis on quantitative data.

(By the way, thanks to your comments, I’ve revised my original blog post to say that data visualization, for me, has remained “fairly” clear and consistent in meaning and purpose.)

By Jonathon Carrell. May 4th, 2017 at 7:57 pm

I’m not disagreeing that a working definition would have value or purpose. I’m stating that in my opinion making overcritical designations for which we may later make exceptions or allow for fuzzy boundaries is wholly unnecessary.

While the majority of useful visualizations are typically quantitative in nature, it isn’t always the case. Typically is the key word here. Adding a numeric value to an otherwise qualitative diagram (such as the example you provided) doesn’t in itself change its form, but rather enhances the context of its content. Even without numbers, it is still data and it is still a visualization.

This is a fruitless debate of semantics.

I leave you to consider this as an alternative:

Data visualization is the graphical representation of data to facilitate understanding and communication.

P.S.

Instead of focusing on whether the data is quantitative, why not use a different adjective … such as “useful representation”? Then we could further disavow pies, bubble charts, word clouds, and radial gauges. [insert maniacal laughter here]

By Stephen Few. May 5th, 2017 at 8:26 am

Jonathon,

Semantics actually matter a great deal. The person who came up with the expression “It’s only semantics” deserves his own special place in hell. When we define a field, we need to draw boundaries somewhere. If we define data visualization as you’ve suggested, we open the door to an array of graphical displays that we don’t currently think of as data visualizations, including ER data diagrams, illustrations of all types, comic books, and all representational forms of visual art. The point of a definition is to enable shared meaning between people and over time. Despite the edge cases, drawing the boundaries of data visualization as quantitative displays bases the definition on a distinction in brain function, and it also fits what people usually think of as data visualization. Adding quantitative data to a qualitative visual display actually does change its form. We represent categorical items quite differently than quantitative values. The graphical mechanisms are different. Quantitative displays rely on visual attributes (2-D position, length, size, etc.) that our brains interpret quantitatively (greater or lesser). These differences provide a natural boundary between data visualization and other forms of visual display.

I’ve limited my work (but not interest) to quantitative displays. Doing this has been quite useful.

By Stephen Few. May 5th, 2017 at 8:40 am

By the way, I’m not arguing that the definition of data visualization that I’ve proposed above is right and every other definition is wrong. Instead, I’m arguing that this definition makes sense and creates clarity that would benefit the field.

By Jonathon Carrell. May 5th, 2017 at 10:09 am

I wasn’t implying semantics don’t matter. Rather, by fruitless, I meant we have no way to prove our positions as being correct or incorrect. My position is that regardless of the type of data being represented (being either qualitative or quantitative), if it is rendered in a graphical format (e.g., a chart or diagram) in such a way that it facilitates greater understanding … that qualifies as data visualization. I have made no suggestion that comic books or other artful works should be considered data visualization. I agree that data visualization is typically geared towards quantitative endeavors. However, there are fields of data analysis that utilize visualizations that are geared towards understanding non-quantitative relationships.

I’m not suggesting opening some Pandora’s box. Rather, I’m suggesting any definition should allow for there being meaningful ways to graph and chart other types of data beyond the quantitative restrictor you’ve suggested.

In any case, I believe we’ve both adequately presented our positions and reasoning.

“In the beginner’s mind there are many possibilities,
but in the expert’s there are few.”

Jonathon

By Stephen Few. May 5th, 2017 at 10:27 am

Jonathon,

Your final quote doesn’t apply to this situation or to me. You used it to suggest the superiority of your position, one that is open to possibilities while mine is supposedly closed-minded, which isn’t the case. Closed-mindedness is not a characteristic of expertise; it is a flaw in reasoning. Your position could be every bit as closed-minded as mine, but I’ll assume that neither of us is exhibiting this flaw. As I’ve said, this isn’t about being right or wrong. I’m arguing a position that I believe is useful. You’re certainly welcome to disagree.

The definition that you suggested does potentially include comics and visual art as examples of data visualization. You proposed the following definition: “Data visualization is the graphical representation of data to facilitate understanding and communication.” The term “graphical” does not refer only to charts and graphs. The term “data” refers to facts of all kinds. Comics and visual art can facilitate understanding and communication regarding facts. I’m proposing a definition that narrows the scope of data visualization to quantitative data in an attempt to avoid the open-endedness, and thus lack of clarity, that your definition invites.

By Jonathon Carrell. May 5th, 2017 at 12:39 pm

I meant no slight by the quotation, nor did I intend for it to be taken as a device to bolster my position or opinion.

I find your earlier definition perfectly acceptable.

“Data visualization is the graphical display of abstract information for two purposes: sense-making (also called data analysis) and communication.” -Stephen Few

You have felt a need to further narrow the scope whereas I do not. While I certainly respect your reasoning and position, I will continue to disagree that the qualifier is necessary as it implies a rule that in my mind doesn’t exist.

Good day sir.

By Dale Lehman. May 7th, 2017 at 9:39 am

Back to your original post – I agree and lament the fact that it equally applies to statistical analysis, data “science” (notwithstanding your critique of the term), big data, or any of the related topics. Tools have replaced understanding. I actually like good tools quite a bit, but when you read the job ads you would think that knowledge of particular computer languages, programs, etc. constitutes good data sensemaking. I attribute this to the fact that it is (fairly) easy to document knowledge of tools (hence job applicants and employers emphasize these) but difficult to document the ability to make sense out of data. I find students have the same bias – they want to be able to say they know how to use A, B, C,… because they are not confident that they know how to use data to make better decisions (or don’t know how to convey that to employers).

As the tools multiply, I wonder if this situation will become worse or better. There are more tools to be exposed to and list on resumes and job ads, but at some point people will (?) start to realize that knowing a tool does not mean you know how to use data to improve decisions. Just as knowing a foreign language does not make you able to be successful in business in another country (though, of course, knowing the language can help).

By Kelvin Lim PH. May 7th, 2017 at 8:42 pm

Hi Jonathon,

Just curious and seeking your clarification on the point you are trying to make quoting this:

“Although data visualization usually features relationships between quantitative values, it can also display relationships that are not quantitative in nature. For instance, the connections between people on a social networking site such as Facebook or between suspected terrorists can be displayed using a node and link visualization. In the following example, people are the nodes, represented as circles, and their relationships are the links, represented as lines that connect them.

Visualizations that feature relationships between entities, such as the people in the example above, can be enriched with the addition of quantitative information as well. For example, the number of times that any two people have interacted could be represented by the thickness of the line that connects them.” – Stephen Few

You quoted Stephen stating that quantitative values can be encoded such as the thickness of the line that connects them. (Other quantitative values can include size of nodes or the distance of the nodes from each other – non-exhaustive).

In your first paragraph, are you suggesting that a node and link visualization such as the Facebook/terrorist network is not quantitative in nature? I would think the number of nodes (connections) is quantitative by itself even without the thickness of the lines that connect them, although those would be most useful to make better sense of the data.

By Jonathon Carrell. May 8th, 2017 at 8:30 pm

@Kelvin

Node and link diagrams, at least in the field I work in, are typically used to visualize relationships so that connections that may not have been apparent are easier to recognize. These relationships are qualitative in nature.

Sure, we can count the number of nodes/relationships and call it quantitative, but that isn’t an effective goal of this type of chart as a table or simple bar graph would serve better. As Stephen pointed out, there are several ways to significantly enrich these types of visuals by also encoding quantitative information, but calling them quantitative in and of themselves seems a stretch to me. What say you Mr. Few?

By Angel Macaluso. May 9th, 2017 at 7:52 pm

Very interesting article. I am an RN working in a Data Analytics & Informatics Department in a hospital system. I create data visualizations using TIBCO Spotfire. I do not have a background in data or analytics. However, I understand the data and the workflows that impede or impact the data. I tried to learn more about the art of data visualization and requested to go to a TDWI conference this year. I was told no by the VP of our department… “that was not my role. It’s not about the graph you choose, it’s about knowing your audience.” Although it is incredibly important to know your audience, there are skills I can learn to better present and sell my story to my audience. I appreciate your article! I know I am on the right path!

By Stephen Few. May 9th, 2017 at 10:53 pm

Angel,

The VP of your department is half right. To visualize data effectively, you must know your audience, but you must also know data visualization best practices. Don’t expect to learn about data visualization by attending TDWI, however. As far as I know, no one who understands data visualization has worked with TDWI since I stopped working with them several years ago. The best courses in data visualization are taught independently, not through large organizations such as TDWI.

By rjss. May 14th, 2017 at 9:55 am

This would require a full book, but I think part of the problem is our “obsession” with grouping data as quantitative or qualitative. The line between the two is quite gray. As an example, imagine a flow chart or network. These can actually be represented quantitatively (as a matrix of numbers), which is why we can use tools to analyze them numerically or visually. Some people may consider such data quantitative while others may consider it qualitative. In this case I would say that using “quantitative” in the definition is not necessary.
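
To make the "matrix of numbers" idea concrete, here is a minimal Python sketch of one common way a network can be encoded numerically: an adjacency matrix. It is only an illustration of the point, not a method proposed by anyone in this thread. The names, the links, and the `adjacency_matrix` helper are all hypothetical and invented for the example.

```python
# Hypothetical example data: these people and links are invented for illustration.
people = ["Ana", "Ben", "Cy", "Dee"]
links = [("Ana", "Ben"), ("Ben", "Cy"), ("Ana", "Cy"), ("Cy", "Dee")]


def adjacency_matrix(nodes, edges):
    """Return a symmetric 0/1 matrix: cell [i][j] is 1 if nodes i and j are linked."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        # The relationship is undirected, so mark both directions.
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return matrix


matrix = adjacency_matrix(people, links)

# Once the relationships are numbers, simple quantities fall out of them,
# such as how many connections each person has (the row sums).
degrees = {name: sum(row) for name, row in zip(people, matrix)}

for name, row in zip(people, matrix):
    print(f"{name:>4} {row}")
print(degrees)
```

The links themselves remain categorical (a link either exists or it doesn't), but counts derived from the matrix, such as the number of connections per person, are quantities that could be charted in the usual ways.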
